NOVEL NUCLEOTIDE AND AMINO ACID SEQUENCES, AND ASSAYS AND
METHODS OF USE THEREOF FOR DIAGNOSIS OF LUNG CANCER 

FIELD OF THE INVENTION 
The present invention is related to novel nucleotide and protein sequences that are
diagnostic markers for lung cancer, and assays and methods of use thereof.

BACKGROUND OF THE INVENTION 

Lung cancer is the primary cause of cancer death among both men and women in the U.S.,
with an estimated 172,000 new cases being reported in 1994. The five-year survival rate
among all lung cancer patients, regardless of the stage of disease at diagnosis, is only 13%. This
contrasts with a five-year survival rate of 46% among cases detected while the disease is still
localized. However, only 16% of lung cancers are discovered before the disease has spread.
Lung cancers are broadly classified into small cell or non-small cell lung cancers. Non-small
cell lung cancers are further divided into adenocarcinomas, bronchoalveolar-alveolar, squamous
cell and large cell carcinomas. Approximately 75-85 percent of lung cancers are non-small cell
cancers and 15-25 percent are small cell cancers of the lung.

Early detection is difficult since clinical symptoms are often not seen until the disease
has reached an advanced stage. Currently, diagnosis is aided by the use of chest x-rays, analysis
of the type of cells contained in sputum and fiberoptic examination of the bronchial passages.
Treatment regimens are determined by the type and stage of the cancer, and include surgery,
radiation therapy and/or chemotherapy.

Early detection of primary, metastatic, and recurrent disease can significantly impact the
prognosis of individuals suffering from lung cancer. Non-small cell lung cancer diagnosed at an
early stage has a significantly better outcome than that diagnosed at more advanced stages.
Similarly, early diagnosis of small cell lung cancer potentially has a better prognosis.

Although current radiotherapeutic agents, chemotherapeutic agents and biological toxins
are potent cytotoxins, they do not discriminate between normal and malignant cells, producing
adverse effects and dose-limiting toxicities. There remains a need for lung cancer specific
cancer markers. There remains a need for reagents and kits which can be used to detect the
presence of lung cancer markers in samples from patients. There remains a need for methods of
screening and diagnosing individuals who have lung cancer and methods of monitoring response
to treatment, disease progression and disease recurrence in patients diagnosed with lung cancer.
There remains a need for reagents, kits and methods for determining the type of lung cancer that
an individual who has lung cancer has. There remains a need for compositions which can
specifically target lung cancer cells. There remains a need for imaging agents which can
specifically bind to lung cancer cells. There remains a need for improved methods of imaging
lung cancer cells. There remains a need for therapeutic agents which can specifically bind to
lung cancer cells. There remains a need for improved methods of treating individuals who are
suspected of suffering from lung cancer.



SUMMARY OF THE INVENTION 

The background art does not teach or suggest markers for lung cancer that are 
sufficiently sensitive and/or accurate, alone or in combination. 

The present invention overcomes these deficiencies of the background art by providing
novel markers for lung cancer that are both sensitive and accurate. Furthermore, these markers
are able to distinguish between different types of lung cancer, such as small cell or non-small
cell lung cancer, and further between non-small cell lung cancer types, such as
adenocarcinomas, squamous cell and large cell carcinomas. These markers are overexpressed in
lung cancer specifically, as opposed to normal lung tissue. The measurement of these markers,
alone or in combination, in patient (biological) samples provides information that the
diagnostician can correlate with a probable diagnosis of lung cancer. The markers of the present
invention, alone or in combination, show a high degree of differential detection between lung
cancer and non-cancerous states.

According to preferred embodiments of the present invention, examples of suitable
biological samples which may optionally be used with preferred embodiments of the present
invention include but are not limited to blood, serum, plasma, blood cells, urine, sputum, saliva,
stool, spinal fluid or CSF, lymph fluid, the external secretions of the skin, respiratory, intestinal,
and genitourinary tracts, tears, milk, neuronal tissue, lung tissue, any human organ or tissue,
including any tumor or normal tissue, any sample obtained by lavage (for example of the
bronchial system or of the breast ductal system), and also samples of in vivo cell culture
constituents. In a preferred embodiment, the biological sample comprises lung tissue and/or



WO 2006/131783 



PCT/IB2005/004037 



sputum and/or a serum sample and/or a urine sample and/or any other tissue or liquid sample. 
The sample can optionally be diluted with a suitable eluant before contacting the sample to an 
antibody and/or performing any other diagnostic assay. 

Information given in the text with regard to cellular localization was determined
according to four different software programs: (i) tmhmm (from Center for Biological Sequence
Analysis, Technical University of Denmark DTU,
http://www.cbs.dtu.dk/services/TMHMM/TMHMM2.0) or (ii) tmpred (from
EMBnet, maintained by the ISREC Bioinformatics group and the LICR Information Technology
Office, Ludwig Institute for Cancer Research, Swiss Institute of Bioinformatics,
http://www.ch.embnet.org/software/TMPRED_form.html) for transmembrane region prediction;
(iii) signalp_hmm or (iv) signalp_nn (both from Center for Biological Sequence Analysis,
Technical University of Denmark DTU,
http://www.cbs.dtu.dk/services/SignalP/background/prediction.php) for signal peptide
prediction. The terms "signalp_hmm" and "signalp_nn" refer to two modes of operation for the
program SignalP: hmm refers to Hidden Markov Model, while nn refers to neural networks.
Localization was also determined through manual inspection of known protein localization
and/or gene structure, and the use of heuristics by the individual inventor. In some cases, for the
manual inspection of cellular localization prediction, the inventors used the ProLoc computational
platform [Einat Hazkani-Covo, Erez Levanon, Galit Rotman, Dan Graur and Amit Novik;
(2004) "Evolution of multicellularity in metazoa: comparative analysis of the subcellular
localization of proteins in Saccharomyces, Drosophila and Caenorhabditis." Cell Biology
International 2004;28(3):171-8.], which predicts protein localization based on various
parameters including protein domains (e.g., prediction of trans-membranous regions and
localization thereof within the protein), pI, protein length, amino acid composition, homology to
pre-annotated proteins, recognition of sequence patterns which direct the protein to a certain
organelle (such as, nuclear localization signal, NLS, mitochondria localization signal), signal
peptide and anchor modeling and using unique domains from Pfam that are specific to a single
compartment.

Information is given in the text with regard to SNPs (single nucleotide polymorphisms).
A description of the abbreviations is as follows. "T -> C", for example, means that the SNP
results in a change at the position given in the table from T to C. Similarly, "M -> Q", for
example, means that the SNP has caused a change in the corresponding amino acid sequence,
from methionine (M) to glutamine (Q). If, in place of a letter at the right hand side for the
nucleotide sequence SNP, there is a space, it indicates that a frameshift has occurred. A
frameshift may also be indicated with a hyphen (-). A stop codon is indicated with an asterisk at
the right hand side (*). As part of the description of an SNP, a comment may be found in
parentheses after the above description of the SNP itself. This comment may include an FTId,
which is an identifier to a SwissProt entry that was created with the indicated SNP. An FTId is
a unique and stable feature identifier, which allows construction of links directly from
position-specific annotation in the feature table to specialized protein-related databases. The FTId is
always the last component of a feature in the description field, as follows: FTId=XXX_number,
in which XXX is the 3-letter code for the specific feature key, separated by an underscore from
a 6-digit number. In the table of the amino acid mutations of the wild type proteins of the
selected splice variants of the invention, the header of the first column is "SNP position(s) on
amino acid sequence", representing a position of a known mutation on amino acid sequence.
SNPs may optionally be used as diagnostic markers according to the present invention, alone or
in combination with one or more other SNPs and/or any other diagnostic marker. Preferred
embodiments of the present invention comprise such SNPs, including but not limited to novel
SNPs on the known (WT or wild type) protein sequences given below, as well as novel nucleic
acid and/or amino acid sequences formed through such SNPs, and/or any SNP on a variant
amino acid and/or nucleic acid sequence described herein.
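
The SNP notation described above is regular enough to be read mechanically. The following is a minimal sketch in Python, assuming each description arrives as a plain string with an optional parenthesized comment. The function name `parse_snp`, the returned dictionary keys, and the example identifier `VAR_012345` are hypothetical and serve only as illustration; the rules applied (a blank or a hyphen for a frameshift, an asterisk for a stop codon, and an FTId made of a 3-letter code, an underscore and a 6-digit number) are the ones stated in the text.

```python
import re

# Left-hand letter, an arrow written "->" or "- >", an optional right-hand
# symbol, and an optional comment in parentheses.
_SNP_RE = re.compile(
    r"^\s*([A-Za-z])\s*-\s*>\s*([A-Za-z*-]?)\s*(?:\((.*)\))?\s*$"
)
# FTId=XXX_number: 3-letter feature key, underscore, 6-digit number.
_FTID_RE = re.compile(r"FTId=([A-Z]{3})_(\d{6})")


def parse_snp(text):
    """Parse one SNP description into a small dictionary (hypothetical layout)."""
    match = _SNP_RE.match(text)
    if match is None:
        raise ValueError(f"not an SNP description: {text!r}")
    original, changed, comment = match.group(1), match.group(2), match.group(3)

    if changed in ("", "-"):
        # A blank or a hyphen on the right-hand side marks a frameshift.
        kind, changed = "frameshift", None
    elif changed == "*":
        # An asterisk on the right-hand side marks a stop codon.
        kind = "stop"
    else:
        kind = "substitution"

    ftid = None
    if comment:
        found = _FTID_RE.search(comment)
        if found:
            ftid = f"{found.group(1)}_{found.group(2)}"

    return {"from": original, "to": changed, "kind": kind, "ftid": ftid}
```

For example, `parse_snp("M -> Q (FTId=VAR_012345)")` reports a substitution from M to Q and extracts the identifier from the comment.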

Information given in the text with regard to the homology to the known proteins was
determined by Smith-Waterman version 5.1.2 using special (non-default) parameters as follows:
-model=sw.model
-GAPEXT=0
-GAPOP=100.0
-MATRIX=blosum100
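
To illustrate what such parameters imply, the following is a minimal sketch of a score-only Smith-Waterman local alignment with affine gap penalties. It is not the version 5.1.2 program referred to above. The names `smith_waterman_score` and `toy_score` are hypothetical, the toy match/mismatch scoring stands in for the BLOSUM100 matrix (which is not reproduced here), and the convention that a gap of length k costs gap_open + gap_ext * k is an assumption. Under that assumed convention, a gap-open penalty of 100 with a gap-extension penalty of 0 makes any gap so expensive that the best local alignment is effectively ungapped.

```python
def smith_waterman_score(a, b, score, gap_open=100.0, gap_ext=0.0):
    """Best local alignment score of a and b with affine gap penalties.

    Assumed convention: a gap of length k costs gap_open + gap_ext * k.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j).
    # E, F: best score of an alignment ending in a gap (horizontal, vertical).
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_ext,
                          H[i][j - 1] - gap_open - gap_ext)
            F[i][j] = max(F[i - 1][j] - gap_ext,
                          H[i - 1][j] - gap_open - gap_ext)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix such as BLOSUM100."""
    return 5.0 if x == y else -4.0
```

With the defaults, aligning "ACDEFGHIK" against "ACDEGHIK" scores only the longest ungapped block, whereas lowering `gap_open` lets the alignment bridge the missing residue.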

Information is given with regard to overexpression of a cluster in cancer based on ESTs.
A key to the p values with regard to the analysis of such overexpression is as follows:

- library-based statistics: P-value without including the level of expression in cell-lines (P1)
- library-based statistics: P-value including the level of expression in cell-lines (P2)
- EST clone statistics: P-value without including the level of expression in cell-lines (SP1)
- EST clone statistics: predicted overexpression ratio without including the level of
expression in cell-lines (R3)
- EST clone statistics: P-value including the level of expression in cell-lines (SP2)
- EST clone statistics: predicted overexpression ratio including the level of
expression in cell-lines (R4)

Library-based statistics refer to statistics over an entire library, while EST clone statistics
refer to expression only for ESTs from a particular tissue or cancer.

Information is given with regard to overexpression of a cluster in cancer based on
microarrays. As a microarray reference, in the specific segment paragraphs, the unabbreviated
tissue name was used as the reference to the type of chip for which expression was measured.
There are two types of microarray results: those from microarrays prepared according to a
design by the present inventors, for which the microarray fabrication procedure is described in
detail in the Materials and Experimental Procedures section herein; and those results from
microarrays using Affymetrix technology. For microarrays prepared according to a design by the
present inventors, the probe name begins with the name of the cluster (gene), followed by an
identifying number. Oligonucleotide microarray results taken from Affymetrix data were from chips
available from Affymetrix Inc, Santa Clara, CA, USA (see for example data regarding the
Human Genome U133 (HG-U133) Set at
www.affymetrix.com/products/arrays/specific/hgu133.affx; GeneChip Human Genome U133A
2.0 Array at www.affymetrix.com/products/arrays/specific/hgu133av2.affx; and Human
Genome U133 Plus 2.0 Array at
www.affymetrix.com/products/arrays/specific/hgu133plus.affx). The probe names follow the
Affymetrix naming convention. The data is available from NCBI Gene Expression Omnibus
(see www.ncbi.nlm.nih.gov/projects/geo/ and Edgar et al, Nucleic Acids Research, 2002, Vol.
30, No. 1, 207-210). The dataset (including results) is available from
www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1133 for the Series GSE1133 database
(published in March 2004); a reference to these results is as follows: Su et al (Proc Natl Acad
Sci USA. 2004 Apr 20;101(16):6062-7. Epub 2004 Apr 09). Probes designed by the present
inventors are listed below.
>H61775_0_11_0
CCCCAGCTTTTATAGAGCGGCCCAAGGAAGAATATTTCCAAGAAGTAGGG
>M85491_0_0_25999
GACATCTTTGCATATCATGTCAGAGCTATAACATCATTGTGGAGAAGCTC
>M85491_0_14_0
GTCATGAAAATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCCGCAG
>Z21368_0_0_61857
AGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAG
>HUMGRP5E_0_0_16630
GCTGATATGGAAGTTGGGGAATCTGAATTGCCAGAGAATCTTGGGAAGAG
>HUMGRP5E_0_2_0
TCTCATAGAAGCAAAGGAGAACAGAAACCACCAGCCACCTCAACCCAAGG
>D56406_0_5_0
TCTGACTTTTACGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGG
>F05068_0_0_5744
ACGGGAGGGAAGGAAGGTGTGCGGGAGGAGTTCTCTGTCTCCACTCCCCT
>F05068_0_0_5754
CAAGGGGAACTGACCGTTGGTCCCGAAGGTCTAGAAGTGAATGGGAGCAG
>F05068_0_8_0
CTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACT
>F05068_0_1_5751
TCTTAGCAGGTAGGTGCCGCAGACCCTGCGGGTTAAGAGGTGGGGTGGGG
>H38804_0_3_0
CGTAATTGCAGTGCATTTAGACAGGCATCTATTTGGACCTGTTTCTATCT
>HSENA78_0_1_0
TGAAGAGTGTGAGGAAAACCTATGTTTGCCGCTTAAGCTTTCAGCTCAGC
>R00299_0_8_0
CCAAGGCTCGTCTGCGCACCTTGTGTCTTGTAGGGTATGGTATGTGGGAC
>Z44808_0_8_0
AAAAGCATGAGTTTCTGACCAGCGTTCTGGACGCGCTGTCCACGGACATG
>Z44808_0_0_72347
ATGTTCTTAGGAGGCAAGCCAGGAGAAGCCGGGTCTGACTTTTCAGCTCA
>Z44808_0_0_72349
TCCTCCAGACCCAAAGCCACAACCCATCGCAAGTCAAGAACACTTTCCAG
>AA161187_0_0_433
ACCCTGGGTGGGCAAAAACGTGCTTTCCCGGACGGGGTTGAAGGGGAGAA
>AA161187_0_0_430
TGGAGACTGTTGCCCCACTCTGCAGATGCAGAAACGGAGGCTTGGCTGCT
>R66178_0_7_0
CCAGTGTGGTATCCTGGGAAACTCGGTTAAAAGGTGAGGCAGAGTACCAG
>HUMPHOSLIP_0_0_18458
AAGGAAGCAGGACCAGTGGATGTGAGGCGTGGTCGAAGAACAACAGAAAG
>HUMPHOSLIP_0_0_18487
ACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGG
>AI076020_0_3_0
ATCAGCACTGCCACCTACACCACGGTGCCGCGCGTGGCCTTCTACGCCGG
>T23580_0_0_902
GTGAAACCCCATTGGCTTCATTGGCTCCTTGATTTAAACCACGCCCGGCT
>T23580_0_0_901
TGAGTCCGTGTTATATCATCTGGTCTCATTGATAGGCGGGATAGGGAGGG
>M79217_0_9_0
TTTGTGGAATAGCAACCCATGGTTATGGCGAGTGACCCGACGTGATCTGG
>M62096_0_0_20588
AAGGCTTAGGTGCAAAGCCATTGGATACCATACCTGAGACCACACAGCCA
>M62096_0_7_0
ACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAGCAGAAA
>M78076_0_7_0
GAGAAGATGAACCCGCTGGAACAGTATGAGCGAAAGGTGAATGCGTCTGT
>T99080_0_0_58896
AACTCACAGCAAGAGCTGTGTTCCAGTTAGCTTTGCTACCAGTTATGCAG
>T08446_0_9_0
CATTTCCACTACGAGAACGTTGACTTTGGCCACATTCAGCTCCTGCTGTC
>HUMCA1XIA_0_0_14909
GCTGCAATCTAAGTTTCGGAATACTTATACCACTCCAGAAATAATCCTCG
>HUMCA1XIA_0_18_0
TTCAGAACTGTTAACATCGCTGACGGGAAGTGGCATCGGGTAGCAATCAG
>T11628_0_9_0
ACAAGATCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCATCATCCAG
>T11628_0_0_45174
TAAACAATCAAAGAGCATGTTGGCCTGGTCCTTTGCTAGGTACTGTAGAG
>T11628_0_0_45161
TGCCTCGCCACAATGGCACCTGCCCTAAAATAGCTTCCCATGTGAGGGCT
>HUMCEA_0_0_96
CAAGAGGGGTTTGGCTGAGACTTTAGGATTGTGATTCAGCTTAGAGGGAC
>HUMCEA_0_0_15183
CCTGGTGGGAGCCCATGAGAAGCGAGTTCTCTGTGCAACGGACTTAGTAA
>HUMCEA_0_0_15182
GCTCCCTGGAGCATCAGCATCATATTCTGGGGTGGAGTCTATCTGGTTCT
>HUMCEA_0_0_15168
TCCTGCCTGTCACCTGAAGTTCTAGATCATTCCCTGGACTCCACTCTATC
>HUMCEA_0_0_15180
TTTAACACAGGATTGGGACAGGATTCAGAGGGACACTGTGGCCCTTCTAC
>R35137_0_5_0
TATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAA
>Z25299_0_3_0
AACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTG
>HSSTROL3_0_0_12518
ATGAGAGTAACCTCACCCGTGCACTAGTTTACAGAGCATTCACTGCCCCA
>HSSTROL3_0_0_12517
CAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGACTTCACAAATGAA
>HSS100PCB_0_0_12280
CTCAAAATGAAACTCCCTCTCGCAGAGCACAATTCCAATTCGCTCTAAAA
>R20779_0_0_30670
CCGCGTTGCTTCTAGAGGCTGAATGCCTTTCAAATGGAGAAGGCTTCCAT
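
The probe listing above uses a FASTA-like layout, with a name line starting with ">" followed by the probe sequence. The following is a minimal sketch, in Python, of reading such a listing and grouping probes by cluster. The function names `parse_probes` and `group_by_cluster` are hypothetical, and taking the text before the first underscore as the cluster name is an assumption based on the statement that the probe name begins with the name of the cluster (gene), followed by an identifying number. The three records in `EXAMPLE` are copied from the listing.

```python
from collections import defaultdict


def parse_probes(fasta_text):
    """Return {probe_name: sequence} from FASTA-style text."""
    probes, name = {}, None
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between name and sequence
        if line.startswith(">"):
            name = line[1:]
            probes[name] = ""
        elif name is not None:
            probes[name] += line.upper()
    return probes


def group_by_cluster(probes):
    """Group probe names by cluster (assumed: text before the first underscore)."""
    groups = defaultdict(list)
    for name in probes:
        cluster = name.split("_", 1)[0]
        groups[cluster].append(name)
    return dict(groups)


EXAMPLE = """\
>H61775_0_11_0
CCCCAGCTTTTATAGAGCGGCCCAAGGAAGAATATTTCCAAGAAGTAGGG
>M85491_0_0_25999
GACATCTTTGCATATCATGTCAGAGCTATAACATCATTGTGGAGAAGCTC
>M85491_0_14_0
GTCATGAAAATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCCGCAG
"""
```

A sanity check that every sequence contains only A, C, G and T is a cheap way to catch extraction damage in a listing of this kind.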
The following list of abbreviations for tissues was used in the TAA histograms. The
term "TAA" stands for "Tumor Associated Antigen", and the TAA histograms, given in the
text, represent the cancerous tissue expression pattern as predicted by the biomarkers selection
engine, as described in detail in examples 1-5 below:

"BONE" for "bone";
"COL" for "colon";
"EPI" for "epithelial";
"GEN" for "general";
"LIVER" for "liver";
"LUN" for "lung";
"LYMPH" for "lymph nodes";
"MARROW" for "bone marrow";
"OVA" for "ovary";
"PANCREAS" for "pancreas";
"PRO" for "prostate";
"STOMACH" for "stomach";
"TCELL" for "T cells";
"THYROID" for "Thyroid";
"MAM" for "breast";
"BRAIN" for "brain";
"UTERUS" for "uterus";
"SKIN" for "skin";
"KIDNEY" for "kidney";
"MUSCLE" for "muscle";
"ADREN" for "adrenal";
"HEAD" for "head and neck";
"BLADDER" for "bladder";



It should be noted that the terms "segment", "seg" and "node" are used interchangeably
in reference to nucleic acid sequences of the present invention; they refer to portions of nucleic
acid sequences that were shown to have one or more properties as described below. They are
also the building blocks that were used to construct complete nucleic acid sequences as
described in greater detail below. Optionally and preferably, they are examples of
oligonucleotides which are embodiments of the present invention, for example as amplicons,
hybridization units and/or from which primers and/or complementary oligonucleotides may
optionally be derived, and/or for any other use.

As used herein the phrase "lung cancer" refers to cancers of the lung including small cell
lung cancer and non-small cell lung cancer, including but not limited to lung adenocarcinoma,
squamous cell carcinoma, and adenocarcinoma.

The term "marker" in the context of the present invention refers to a nucleic acid
fragment, a peptide, or a polypeptide, which is differentially present in a sample taken from
subjects (patients) having lung cancer (or one of the above indicative conditions) as compared to
a comparable sample taken from subjects who do not have lung cancer (or one of the above
indicative conditions).

The phrase "differentially present" refers to differences in the quantity of a marker
present in a sample taken from patients having lung cancer (or one of the above indicative
conditions) as compared to a comparable sample taken from patients who do not have lung
cancer (or one of the above indicative conditions). For example, a nucleic acid fragment may
optionally be differentially present between the two samples if the amount of the nucleic acid
fragment in one sample is significantly different from the amount of the nucleic acid fragment in
the other sample, for example as measured by hybridization and/or NAT-based assays. A
polypeptide is differentially present between the two samples if the amount of the polypeptide in
one sample is significantly different from the amount of the polypeptide in the other sample. It
should be noted that if the marker is detectable in one sample and not detectable in the other,
then such a marker can be considered to be differentially present.

As used herein the phrase "diagnostic" means identifying the presence or nature of a
pathologic condition. Diagnostic methods differ in their sensitivity and specificity. The
"sensitivity" of a diagnostic assay is the percentage of diseased individuals who test positive
(percent of "true positives"). Diseased individuals not detected by the assay are "false
negatives." Subjects who are not diseased and who test negative in the assay are termed "true
negatives." The "specificity" of a diagnostic assay is 1 minus the false positive rate, where the
"false positive" rate is defined as the proportion of those without the disease who test positive.
While a particular diagnostic method may not provide a definitive diagnosis of a condition, it
suffices if the method provides a positive indication that aids in diagnosis.
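
The definitions of sensitivity and specificity given above can be written directly as arithmetic on the four outcome counts. The following is a minimal sketch in Python; the function names and the example counts are hypothetical and are not data from the present invention.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased individuals who test positive."""
    diseased = true_pos + false_neg
    if diseased == 0:
        raise ValueError("no diseased individuals in the sample")
    return true_pos / diseased


def specificity(true_neg, false_pos):
    """1 minus the false positive rate, as defined in the text."""
    non_diseased = true_neg + false_pos
    if non_diseased == 0:
        raise ValueError("no non-diseased individuals in the sample")
    false_positive_rate = false_pos / non_diseased
    return 1.0 - false_positive_rate


# Hypothetical counts: 90 of 100 diseased subjects test positive,
# and 20 of 100 non-diseased subjects test positive.
example_sensitivity = sensitivity(true_pos=90, false_neg=10)
example_specificity = specificity(true_neg=80, false_pos=20)
```

With these hypothetical counts the assay would have a sensitivity of 90% and a specificity of 80%.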

As used herein the phrase "diagnosing" refers to classifying a disease or a symptom,
determining a severity of the disease, monitoring disease progression, forecasting an outcome of
a disease and/or prospects of recovery. The term "detecting" may also optionally encompass any
of the above.

Diagnosis of a disease according to the present invention can be effected by determining
a level of a polynucleotide or a polypeptide of the present invention in a biological sample
obtained from the subject, wherein the level determined can be correlated with predisposition to,
or presence or absence of the disease. It should be noted that a "biological sample obtained from
the subject" may also optionally comprise a sample that has not been physically removed from
the subject, as described in greater detail below.

As used herein, the term "level" refers to expression levels of RNA and/or protein or to
DNA copy number of a marker of the present invention.

Typically the level of the marker in a biological sample obtained from the subject is
different (i.e., increased or decreased) from the level of the same variant in a similar sample
obtained from a healthy individual (examples of biological samples are described herein).

Numerous well-known tissue or fluid collection methods can be utilized to collect the
biological sample from the subject in order to determine the level of DNA, RNA and/or
polypeptide of the variant of interest in the subject.

Examples include, but are not limited to, fine needle biopsy, needle biopsy, core needle
biopsy and surgical biopsy (e.g., brain biopsy), and lavage. Regardless of the procedure
employed, once a biopsy/sample is obtained the level of the variant can be determined and a
diagnosis can thus be made.

Determining the level of the same variant in normal tissues of the same origin is
preferably effected alongside to detect an elevated expression and/or amplification and/or a
decreased expression of the variant as opposed to the normal tissues.

A "test amount" of a marker refers to an amount of a marker in a subject's sample that is 
consistent with a diagnosis of lung cancer (or one of the above indicative conditions). A test 
amount can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative 
intensity of signals). 

A "control amount" of a marker can be any amount or a range of amounts to be
compared against a test amount of a marker. For example, a control amount of a marker can be
the amount of a marker in a patient with lung cancer (or one of the above indicative conditions)
or a person without lung cancer (or one of the above indicative conditions). A control amount
can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative
intensity of signals).

"Detect" refers to identifying the presence, absence or amount of the object to be 
detected. 

A "label" includes any moiety or item detectable by spectroscopic, photochemical,
biochemical, immunochemical, or chemical means. For example, useful labels include 32P, 35S,
fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA),
biotin-streptavidin, digoxigenin, haptens and proteins for which antisera or monoclonal
antibodies are available, or nucleic acid molecules with a sequence complementary to a target.
The label often generates a measurable signal, such as a radioactive, chromogenic, or
fluorescent signal, that can be used to quantify the amount of bound label in a sample. The label
can be incorporated in or attached to a primer or probe either covalently, or through ionic, van
der Waals or hydrogen bonds, e.g., incorporation of radioactive nucleotides, or biotinylated
nucleotides that are recognized by streptavidin. The label may be directly or indirectly
detectable. Indirect detection can involve the binding of a second label to the first label, directly
or indirectly. For example, the label can be the ligand of a binding partner, such as biotin, which
is a binding partner for streptavidin, or a nucleotide sequence, which is the binding partner for a
complementary sequence, to which it can specifically hybridize. The binding partner may itself
be directly detectable, for example, an antibody may be itself labeled with a fluorescent
molecule. The binding partner also may be indirectly detectable, for example, a nucleic acid
having a complementary nucleotide sequence can be a part of a branched DNA molecule that is
in turn detectable through hybridization with other labeled nucleic acid molecules (see, e.g., P.
D. Fahrlander and A. Klausner, Bio/Technology 6:1165 (1988)). Quantitation of the signal is
achieved by, e.g., scintillation counting, densitometry, or flow cytometry.

Exemplary detectable labels, optionally and preferably for use with immunoassays,
include but are not limited to magnetic beads, fluorescent dyes, radiolabels, enzymes (e.g.,
horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric
labels such as colloidal gold or colored glass or plastic beads. Alternatively, the marker in the
sample can be detected using an indirect assay, wherein, for example, a second, labeled antibody
is used to detect bound marker-specific antibody, and/or in a competition or inhibition assay
wherein, for example, a monoclonal antibody which binds to a distinct epitope of the marker is
incubated simultaneously with the mixture.

"Immunoassay" is an assay that uses an antibody to specifically bind an antigen. The
immunoassay is characterized by the use of specific binding properties of a particular antibody
to isolate, target, and/or quantify the antigen.

The phrase "specifically (or selectively) binds" to an antibody or "specifically (or
selectively) immunoreactive with," when referring to a protein or peptide (or other epitope),
refers to a binding reaction that is determinative of the presence of the protein in a
heterogeneous population of proteins and other biologics. Thus, under designated immunoassay
conditions, the specified antibodies bind to a particular protein at least two times greater than the
background (non-specific signal) and do not substantially bind in a significant amount to other
proteins present in the sample. Specific binding to an antibody under such conditions may
require an antibody that is selected for its specificity for a particular protein. For example,
polyclonal antibodies raised to seminal basic protein from specific species such as rat, mouse, or
human can be selected to obtain only those polyclonal antibodies that are specifically
immunoreactive with seminal basic protein and not with other proteins, except for polymorphic
variants and alleles of seminal basic protein. This selection may be achieved by subtracting out
antibodies that cross-react with seminal basic protein molecules from other species. A variety of
immunoassay formats may be used to select antibodies specifically immunoreactive with a
particular protein. For example, solid-phase ELISA immunoassays are routinely used to select
antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A
Laboratory Manual (1988), for a description of immunoassay formats and conditions that can be
used to determine specific immunoreactivity). Typically a specific or selective reaction will be
at least twice background signal or noise and more typically more than 10 to 100 times
background.
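
The signal-to-background criterion described above amounts to a simple ratio test. The following is a minimal sketch in Python; the function names, argument names and any readings passed to them are hypothetical, while the default factor of two is the threshold stated in the text.

```python
def fold_over_background(signal, background):
    """Ratio of the measured signal to the non-specific background signal."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background


def is_specific_binding(signal, background, min_fold=2.0):
    """True when the signal is at least `min_fold` times the background."""
    return fold_over_background(signal, background) >= min_fold
```

A stricter screen, in line with the "10 to 100 times background" range mentioned above, could be expressed by passing a larger `min_fold`.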

According to preferred embodiments of the present invention, preferably any of the
above nucleic acid and/or amino acid sequences further comprises any sequence having at least
about 70%, preferably at least about 80%, more preferably at least about 90%, most preferably
at least about 95% homology thereto.

Unless otherwise noted, all experimental data relates to variants of the present invention,
named according to the segment being tested (as expression was tested through RT-PCR as
described).

All nucleic acid sequences and/or amino acid sequences shown herein as embodiments
of the present invention relate to their isolated form, as isolated polynucleotides (including for
all transcripts), oligonucleotides (including for all segments, amplicons and primers), peptides
(including for all tails, bridges, insertions or heads, optionally including other antibody epitopes
as described herein) and/or polypeptides (including for all proteins). It should be noted that
oligonucleotide and polynucleotide, or peptide and polypeptide, may optionally be used
interchangeably.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1 and 2. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1022, 1023, 1024, 1025, 1026 and 1027. 
25 According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide comprising SEQ ID NOs: 1281 and 1282. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 3 and 4. 

According to preferred embodiments of the present invention, there is provided an 
30 isolated polynucleotide comprising SEQ ID NOs: 1028, 1029, 1030, 1031, 1032, 1033, 1034, 
1035, 1036, 1037 and 1038. 
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According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1283 and 1284. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 5, 6, 7 and 8. 
5 According to preferred embodiments of the present invention, there is provided an 

isolated polynucleotide comprising SEQ ID NOs: 1039, 1040, 1041, 1042, 1043, 1044, 1045, 
1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 
1061, 1062, 1063, 1064, 1065 and 1066. 

According to preferred embodiments of the present invention, there is provided an 
10 isolated polypeptide comprising SEQ ID NOs: 1285, 1286, 1287 and 1288. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 9, 10, 11, 12, 13, 14 and 15. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1067, 1068, 1069, 1070, 1071, 1072, 1073, 
15 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 
1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099 and 1100. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs 1289, 1290, 1291, 1292, 1293 and 1294. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 20 and 21.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1130, 1131, 1132, 1133 and 1134. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1299 and 1300.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 22, 23 and 24.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1135, 1136, 1137, 1138, 1139, 1140, 1141, 
1142, 1143 and 1144. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1301, 1302 and 1303.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 25, 26 and 27.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1145, 1146, 1147, 1148, 1149, 1150, 1151,
1152, 1153, 1154, 1155 and 1156.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1304 and 1305.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NO: 28.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1157, 1158, 1159, 1160, 1161, 1162, 1163,
1164, 1165, 1166, 1167, 1168, 1169, 1170 and 1171. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1306.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 29 and 30.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1172, 1173, 1174, 1175, 1176, 1177, 1178, 
1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190 and 1191.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1307 and 1308.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NO: 31.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1192, 1193, 1194, 1195, 1196, 1197 and
1198.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1309. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NO: 32.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1199, 1200, 1201, 1202, 1203, 1204, 1205,
1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214 and 1215. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1310.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NO: 33.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1216, 1217, 1218, 1219, 1220, 1221,
1222, 1223, 1224, 1225, 1226 and 1227.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1311. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NO: 34.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1228, 1229, 1230, 1231, 1232 and 1233.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1312. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NO: 35.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1234, 1235, 1236, 1237, 1238, 1239, 1240, 
1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253 and 1254. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1313.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 36, 37, 38, 39 and 40. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1255, 1256, 1257, 1258, 1259, 1260, 1261,
1262, 1263, 1264, 1265, 1266, 1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274 and 1275.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1314, 1315, 1316 and 1317.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 125, 126, 127, 128, 129 and 130.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 887, 888, 889, 890, 891, 892, 893, 894, 895,
896, 897, 898, 899, 900, 901 and 902. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1394, 1395, 1396, 1397 and 1398.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 131 and 132.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 903, 904, 905, 906, 907, 908 and 909.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1399 and 1400.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 99, 100, 101 and 102. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 742, 743, 744, 745, 746, 747, 748, 749, 750, 
751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769,
770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787 and 
788. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1372, 1373, 1374 and 1375.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NO: 134.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 913, 914, 915, 916, 917, 918, 919, 920, 921, 
922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935 and 936.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NO: 1402.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NO: 133. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 910, 911 and 912.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 141, 142 and 143.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 961, 962, 963, 964, 965, 966, 967, 968, 969, 
970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 
989 and 990.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising:
Protein Name
HUMOSTRO_PEA_1_PEA_1_P21
HUMOSTRO_PEA_1_PEA_1_P25
HUMOSTRO_PEA_1_PEA_1_P30

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 51, 52, 53, 54, 55, 56 and 57.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 518, 519, 520, 521, 522, 523, 524, 525, 526,
527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545,
546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564,
565, 566, 567, 568, 569 and 570. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1327, 1328, 1329, 1330, 1331, 1332 and 1333.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 135, 136, 137, 138, 139 and 140. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 937, 938, 939, 940, 941, 942, 943, 944, 945, 
946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959 and 960.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1403, 1404, 1405, 1406, 1407 and 1408.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 41, 42, 43, 44, 45, 46 and 47.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 482, 483, 484, 485, 486, 487, 488, 489, 490,
491, 492, 493, 494, 495, 496, 497, 498, 499, 500 and 501.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1318, 1319, 1320, 1321, 1322 and 1323.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 121, 122, 123 and 124.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 876, 877, 878, 879, 880, 881, 882, 883, 884, 
885 and 886. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1390, 1391, 1392 and 1393.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 48, 49 and 50. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 502, 503, 504, 505, 506, 507, 508, 509, 510,
511, 512, 513, 514, 515, 516 and 517. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1324, 1325 and 1326. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1464 and 1465.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 1276, 1277, 1278, 1279 and 1280.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1415. 
Protein Name          Corresponding Transcript(s)
HSU33147_PEA_1_P5     HSU33147_PEA_1_T1; HSU33147_PEA_1_T2

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NO: 58. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 571, 572, 573, 574, 575, 576, 577 and 578.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NO: 1334.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 74, 75, 76, 77, 78, 79, 80, 81 and 82. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 659, 660, 661, 662, 663, 664, 665, 666, 667,
668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 
687, 688, 689, 690, 691, 692 and 693. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1350, 1351, 1352, 1353, 1354, 1355, 1356 and
1357.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 
Transcript Name 
T23580_T10 

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 579, 580, 581, 582 and 583.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1335.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 59, 60, 61, 62, 63 and 64.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 584, 585, 586, 587, 588, 589, 590, 591, 592, 
593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 
612, 613, 614 and 615. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1336, 1337, 1338, 1339 and 1340.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 65, 66, 67, 68, 69, 70, 71, 72 and 73. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 616, 617, 618, 619, 620, 621, 622, 623, 624, 
625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643,
644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658 and 659. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1341, 1342, 1343, 1344, 1345, 1346, 1347, 1348 
and 1349. 

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94,
95 and 96. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 695, 696, 697, 698, 699, 700, 701, 702, 703, 
704 and 705.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1358, 1359, 1360, 1361, 1362, 1363, 1364, 1365,
1366, 1367, 1368 and 1369. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 97 and 98.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 706, 707, 708, 709, 710, 711, 712, 713, 714, 
715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 
734, 735, 736, 737, 738, 739, 740 and 741.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1370 and 1371.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 103, 104, 105, 106, 107 and 108. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 789, 790, 791, 792, 793, 794, 795, 796, 797,
798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812 and 813.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1376, 1377, 1378 and 1379.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 114, 115, 116, 117, 118 and 119.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 856, 857, 858, 859, 860, 861, 862, 863, 864,
865, 866, 867, 868, 869, 870, 871, 872, 873, 874 and 875. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1385, 1386, 1387, 1388 and 1389.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 144, 145, 146, 147, 148 and 149.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 991, 992, 993, 994, 995, 996, 997, 998, 999, 
1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 
1015 and 1016.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NOs: 1409, 1410, 1411, 1412 and 1413. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NO: 150.

According to preferred embodiments of the present invention, there is provided an
isolated polynucleotide comprising SEQ ID NOs: 1017, 1018, 1019, 1020 and 1021.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide comprising SEQ ID NO: 1414. 

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 109, 110, 111, 112 and 113.

According to preferred embodiments of the present invention, there is provided an 
isolated polynucleotide comprising SEQ ID NOs: 814, 815, 816, 817, 818, 819, 820, 821, 822, 
823, 824, 825, 826, 827, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 
843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854 and 855.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide comprising SEQ ID NOs: 1380, 1381, 1382, 1383 and 1384.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HSSTROL3_P4, comprising a first amino acid
sequence being at least 90 % homologous to 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS 
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1-163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P4, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P4,
a second amino acid sequence being at least 90 % homologous to 

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN 
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL 
PSPVDAAFEDAQGHIWFFQGAQYWVYDGEBCPVLGPAPLTELGLVRFPVHAALVWGPE 
KNKIYFFRGRDYWRFHPSTRRVDSPWRJRATDWRGVPSEIDAAFQDADG corresponding
to amino acids 165 - 445 of MM11_HUMAN, which also corresponds to amino acids 165 - 445
of HSSTROL3_P4, and a third amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence

ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG
corresponding to amino acids 446 - 496 of HSSTROL3_P4, wherein said first amino acid
sequence, bridging amino acid, second amino acid sequence and third amino acid sequence are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HSSTROL3_P4, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG in
HSSTROL3_P4.
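
The percentage thresholds recited above can be illustrated with a minimal sketch. It assumes, purely for illustration, that "homologous" is approximated by ungapped percent identity between two sequences of equal length; the specification may define homology differently (for example through an alignment program), so the function `percent_identity` and the candidate sequence below are hypothetical and not part of the original disclosure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match.

    Assumption: ungapped, position-by-position comparison. This stands in
    for "homology" only as a rough illustration.
    """
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Tail sequence given for amino acids 446 - 496 of HSSTROL3_P4 (51 residues).
TAIL_P4 = "ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG"

# Hypothetical candidate: the same tail with its first residue changed (A -> G).
candidate = "G" + TAIL_P4[1:]

score = percent_identity(TAIL_P4, candidate)

# Thresholds recited in the paragraph above.
thresholds = [70, 80, 85, 90, 95]
met = [t for t in thresholds if score >= t]
print(f"{score:.1f}% identity; thresholds met: {met}")
```

A single substitution in a 51-residue tail leaves 50 of 51 positions identical, so under this simplified measure the candidate would satisfy every recited threshold.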

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HSSTROL3_P5, comprising a first amino acid
sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1-163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P5, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P5,
a second amino acid sequence being at least 90 % homologous to 

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQ corresponding to amino acids 165 - 358 of MM11_HUMAN,
which also corresponds to amino acids 165 - 358 of HSSTROL3_P5, and a third amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
ELGFPSSTGRDESLEHCRCQGLHK corresponding to amino acids 359 - 382 of
HSSTROL3_P5, wherein said first amino acid sequence, bridging amino acid, second amino
acid sequence and third amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HSSTROL3_P5, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
ELGFPSSTGRDESLEHCRCQGLHK in HSSTROL3_P5.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HSSTROL3_P7, comprising a first amino acid
sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1-163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P7, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P7,
a second amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQG corresponding to amino acids 165 - 359 of MM11_HUMAN,
which also corresponds to amino acids 165 - 359 of HSSTROL3_P7, and a third amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
TTGVSTPAPGV corresponding to amino acids 360 - 370 of HSSTROL3_P7, wherein said first
amino acid sequence, bridging amino acid, second amino acid sequence and third amino acid 
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HSSTROL3_P7, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
TTGVSTPAPGV in HSSTROL3_P7.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HSSTROL3_P8, comprising a first amino acid
sequence being at least 90 % homologous to 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS 
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP 
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1-163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P8, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P8,
a second amino acid sequence being at least 90 % homologous to 

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG 
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLE corresponding to amino acids 165 - 286 of MM11_HUMAN, which also corresponds
to amino acids 165 - 286 of HSSTROL3_P8, and a third amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VRPCLPVPLLLCWPL corresponding to amino acids 287 - 301 of HSSTROL3_P8, wherein
said first amino acid sequence, bridging amino acid, second amino acid sequence and third
amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HSSTROL3_P8, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence 
VRPCLPVPLLLCWPL in HSSTROL3_P8. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HSSTROL3_P9, comprising a first amino acid
sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS 
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQK corresponding to amino acids 1 - 
96 of MM11_HUMAN, which also corresponds to amino acids 1 - 96 of HSSTROL3_P9, a
second amino acid sequence being at least 90 % homologous to 

RILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW
corresponding to amino acids 113 - 163 of MM11_HUMAN, which also corresponds to amino
acids 97 - 147 of HSSTROL3_P9, a bridging amino acid H corresponding to amino acid 148 of
HSSTROL3_P9, a third amino acid sequence being at least 90 % homologous to
GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQG corresponding to amino acids 165 - 359 of MM11_HUMAN,
which also corresponds to amino acids 149 - 343 of HSSTROL3_P9, and a fourth amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
TTGVSTPAPGV corresponding to amino acids 344 - 354 of HSSTROL3_P9, wherein said first
amino acid sequence, second amino acid sequence, bridging amino acid, third amino acid 
sequence and fourth amino acid sequence are contiguous and in a sequential order. 
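
The requirement that the segments be "contiguous and in a sequential order" can be checked mechanically. The sketch below is illustrative only: it copies the HSSTROL3_P9 coordinates from the paragraph above, and the helper names (`is_contiguous`, `span_length`) are hypothetical. It confirms that the segments tile positions 1 to 354 without gaps or overlaps, and that each segment has the same length as the MM11_HUMAN range it is said to correspond to.

```python
# (label, first, last) positions on HSSTROL3_P9, as recited in the text.
P9_SEGMENTS = [
    ("first", 1, 96),
    ("second", 97, 147),
    ("bridge H", 148, 148),
    ("third", 149, 343),
    ("fourth", 344, 354),
]

# Corresponding ranges on MM11_HUMAN, where the text states one.
MM11_RANGES = {
    "first": (1, 96),
    "second": (113, 163),
    "third": (165, 359),
}


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive range."""
    return last - first + 1


def is_contiguous(segments) -> bool:
    """True if the segments start at 1 and each begins right after the previous one ends."""
    expected_start = 1
    for _, first, last in segments:
        if first != expected_start or last < first:
            return False
        expected_start = last + 1
    return True


contiguous = is_contiguous(P9_SEGMENTS)
lengths_match = all(
    span_length(first, last) == span_length(*MM11_RANGES[label])
    for label, first, last in P9_SEGMENTS
    if label in MM11_RANGES
)
total_length = P9_SEGMENTS[-1][2]
print(contiguous, lengths_match, total_length)
```

Note that positions 97 - 112 of MM11_HUMAN have no counterpart in the list, which is consistent with the first and second segments being adjacent in HSSTROL3_P9 but not in MM11_HUMAN.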

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of HSSTROL3_P9, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise KR, having a structure as follows: a 
sequence starting from any of amino acid numbers 96-x to 96; and ending at any of amino acid 
numbers 97 + ((n-2) - x), in which x varies from 0 to n-2.
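
The edge-portion formula can be made concrete with a short sketch. For a chosen length n, it enumerates every window that starts at amino acid 96 - x and ends at amino acid 97 + ((n-2) - x) for x from 0 to n-2, so each window has exactly n residues and always contains positions 96 and 97 (the "KR" pair). The function name `edge_windows` and the default `protein_length=354` (taken from the HSSTROL3_P9 coordinates recited earlier) are assumptions made for illustration.

```python
def edge_windows(n: int, junction: int = 96, protein_length: int = 354):
    """List (start, end) windows of length n that span junction and junction + 1.

    Follows the recited formula: start = junction - x and
    end = (junction + 1) + ((n - 2) - x), for x = 0 .. n - 2.
    Windows that would fall outside 1..protein_length are skipped.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span both junction residues")
    windows = []
    for x in range(n - 1):  # x = 0 .. n-2
        start = junction - x
        end = junction + 1 + (n - 2) - x
        if start >= 1 and end <= protein_length:
            windows.append((start, end))
    return windows


windows = edge_windows(10)
for start, end in windows:
    print(start, end)
```

For n = 10 this yields nine windows, from (96, 105) at x = 0 down to (88, 97) at x = 8.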

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HSSTROL3_P9, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
TTGVSTPAPGV in HSSTROL3_P9.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMCA1XIA_P14, comprising a first amino acid 
sequence being at least 90 % homologous to 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM 
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH 
YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT 
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG 
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM 
GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP 
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAG
PRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQG
PIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVK 
GADGVRGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVGQIGPRGEDGPEGPKGRAGPT 
GDPGPSGQAGEKGKLGVPGLPGYPGRQGPKGSTGFPGFPGANGEKGARGVAGKPGPR 
GQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQGPVGFPGPKGPPGP
PGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQG
LPGAAGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQGPPGP 
V corresponding to amino acids 1 - 1056 of CA1B_HUMAN_V5, which also corresponds to
amino acids 1 - 1056 of HUMCA1XIA_P14, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VSMMIINSQTIMVVNYSSSFITLML corresponding to amino acids 1057 - 1081 of
HUMCA1XIA_P14, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a tail of HUMCA1XIA_P14, comprising a polypeptide being 
10 at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 

least about 90% and most preferably at least about 95% homologous to the sequence 

VSMMIINSQTIMVVNYSSSFITLML in HUMCA1XIA_P14.
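
By way of non-limiting illustration only, the graded homology thresholds recited above (70%, 80%, 85%, 90%, 95%) can be checked against a candidate tail with a short script. The sketch below assumes that homology is scored as simple percent identity between two equal-length, ungapped sequences; this passage does not state how homology is computed, so that scoring rule, and the names percent_identity and thresholds_met, are assumptions made for the example. In practice a gapped alignment tool would ordinarily be used.

```python
# Illustrative sketch only. Assumption: "homologous" is approximated here by
# ungapped percent identity over sequences of equal length.

TAIL_P14 = "VSMMIINSQTIMVVNYSSSFITLML"  # tail of HUMCA1XIA_P14 as given in the text
THRESHOLDS = (70, 80, 85, 90, 95)       # percentages recited in the text


def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of positions at which the two sequences carry the same residue."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares equal-length, ungapped sequences")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def thresholds_met(candidate: str, reference: str = TAIL_P14) -> list:
    """Return the recited thresholds that the candidate reaches or exceeds."""
    identity = percent_identity(candidate, reference)
    return [t for t in THRESHOLDS if identity >= t]
```

For example, a 25-residue candidate differing from the tail at three positions scores 22/25 = 88% under this rule and therefore meets the 70%, 80% and 85% levels but not the 90% or 95% levels.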

According to preferred embodiments of the present invention, there is provided an 

isolated chimeric polypeptide encoding for HUMCA1XIA_P15, comprising a first amino acid
sequence being at least 90% homologous to

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT 

GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN 
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG 
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM 

GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAG 
PRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQG 
PIGPPGEK corresponding to amino acids 1 - 714 of CA1B_HUMAN, which also corresponds
to amino acids 1 - 714 of HUMCA1XIA_P15, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
MCCNLSFGILIPLQK corresponding to amino acids 715 - 729 of HUMCA1XIA_P15, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMCA1XIA_P15, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
MCCNLSFGILIPLQK in HUMCA1XIA_P15.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMCA1XIA_P16, comprising a first amino acid
sequence being at least 90% homologous to

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT 
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY 
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM 
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH

YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT 
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED 
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM 
GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP 
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEA 
corresponding to amino acids 1 - 648 of CA1B_HUMAN, which also corresponds to amino
acids 1 - 648 of HUMCA1XIA_P16, a second amino acid sequence being at least 90%
homologous to GMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQGPIGPPGEK
corresponding to amino acids 667 - 714 of CA1B_HUMAN, which also corresponds to amino
acids 649 - 696 of HUMCA1XIA_P16, and a third amino acid sequence being at least 70%, 
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
VSFSFSLFYlCKVn^ACDKRFVGRHDERKVVKLSLPLYLIYE corresponding to amino 
acids 697 - 738 of HUMCA1XIA_P16, wherein said first amino acid sequence, second amino
acid sequence and third amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of HUMCA1XIA_P16, comprising a 
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more 
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise AG, having a structure as follows: a 
sequence starting from any of amino acid numbers 648-x to 648; and ending at any of amino 
acid numbers 649+ ((n-2) - x), in which x varies from 0 to n-2.
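
By way of non-limiting illustration only, the start and end positions permitted by the edge portion definition above can be enumerated as follows. The sketch reads the definition as: for each x from 0 to n-2, the portion starts at amino acid 648-x and ends at amino acid 649 + ((n-2) - x), so that every portion has length n and contains both junction residues 648 and 649. That reading, and the function name edge_portion_windows, are assumptions made for the example.

```python
# Illustrative sketch only. Positions are 1-based and inclusive, as in the text.

def edge_portion_windows(n: int, left: int = 648) -> list:
    """Return (start, end) pairs for every length-n portion spanning the
    junction between residue `left` and residue `left + 1`."""
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    windows = []
    for x in range(0, n - 1):                 # x varies from 0 to n-2
        start = left - x                      # 648 - x
        end = (left + 1) + ((n - 2) - x)      # 649 + ((n-2) - x)
        windows.append((start, end))
    return windows
```

For n = 10 this yields nine portions, from (648, 657) for x = 0 to (640, 649) for x = 8. The same function may be applied to other junctions by changing the assumed parameter left.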

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMCA1XIA_P16, comprising a polypeptide being 
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
VSFSFSLFYIQCVIKFA in HUMCA1XIA_P16.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMCA1XIA_P17, comprising a first amino acid
sequence being at least 90% homologous to

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT 
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDE corresponding to amino acids 1 - 260 of CA1B_HUMAN,
which also corresponds to amino acids 1 - 260 of HUMCA1XIA_P17, and a second amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
VRSTRPEKVFVFQ corresponding to amino acids 261 - 273 of HUMCA1XIA_P17, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMCA1XIA_P17, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
VRSTRPEKVFVFQ in HUMCA1XIA_P17.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R20779_P2, comprising a first amino acid sequence
being at least 90% homologous to

MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNTAEIQHCLV 
NAGDVGCGVFECFENNSCEIRGLHGICMTFLHNAGKFDAQGKSFIKDALKCKAHALRH 
RFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQENTRVIVEMIHFKDLLLHE 

corresponding to amino acids 1 - 169 of STC2_HUMAN, which also corresponds to amino
acids 1 - 169 of R20779_P2, and a second amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence CYKIEITMPKRRKVKLRD
corresponding to amino acids 170 - 187 of R20779_P2, wherein said first amino acid sequence
and second amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R20779_P2, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about 
90% and most preferably at least about 95% homologous to the sequence 

CYKIEITMPKRRKVKLRD in R20779_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P21, comprising a
first amino acid sequence being at least 90% homologous to

MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQ 
corresponding to amino acids 1 - 58 of OSTP_HUMAN, which also corresponds to amino acids
1 - 58 of HUMOSTRO_PEA_1_PEA_1_P21, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence VFLNFS
corresponding to amino acids 59 - 64 of HUMOSTRO_PEA_1_PEA_1_P21, wherein said first
amino acid sequence and second amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMOSTRO_PEA_1_PEA_1_P21, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence VFLNFS in HUMOSTRO_PEA_1_PEA_1_P21.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P25, comprising a
first amino acid sequence being at least 90% homologous to

MRIAVICFCLLGITCAIPVKQADSGSSEEKQ corresponding to amino acids 1 - 31 of 

OSTP_HUMAN, which also corresponds to amino acids 1 - 31 of
HUMOSTRO_PEA_1_PEA_1_P25, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence H corresponding to
amino acids 32 - 32 of HUMOSTRO_PEA_1_PEA_1_P25, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P30, comprising a
first amino acid sequence being at least 90% homologous to

MRIAVICFCLLGITCAIPVKQADSGSSEEKQ corresponding to amino acids 1 - 31 of 

OSTP_HUMAN, which also corresponds to amino acids 1 - 31 of
HUMOSTRO_PEA_1_PEA_1_P30, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence VSIFYVFI
corresponding to amino acids 32 - 39 of HUMOSTRO_PEA_1_PEA_1_P30, wherein said first
amino acid sequence and second amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMOSTRO_PEA_1_PEA_1_P30, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 

sequence VSIFYVFI in HUMOSTRO_PEA_1_PEA_1_P30.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P10, comprising a first
amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISE corresponding to amino acids 1 - 67 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 67 of HUMPHOSLIP_PEA_2_P10, and a second amino acid sequence being at
least 90 % homologous to 

KVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMK 
DPVASTSNLDMDFRGAFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMES 
YFRAGALQLLLVGDKVPHDLDMLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKP
SGTTISVTASVTIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSN

HSALESLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVTNHAGFLTI 
GADLHFAKGLREVIEKNRPADVRASTAPTPSTAAV corresponding to amino acids 163 - 
493 of PLTP_HUMAN, which also corresponds to amino acids 68 - 398 of
HUMPHOSLIP_PEA_2_P10, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 
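
By way of non-limiting illustration only, the coordinate correspondences recited for a chimeric polypeptide can be checked for internal consistency: each segment should have the same length in the known protein and in the variant, and the variant segments should follow one another without gaps, which is one way of reading "contiguous and in a sequential order". The sketch below uses only the coordinates stated above for HUMPHOSLIP_PEA_2_P10; the names Segment and check_segments are assumptions made for the example.

```python
# Illustrative sketch only. All coordinates are 1-based and inclusive.
from typing import NamedTuple


class Segment(NamedTuple):
    ref_start: int  # first residue in the known protein (here PLTP_HUMAN)
    ref_end: int    # last residue in the known protein
    var_start: int  # first residue in the variant
    var_end: int    # last residue in the variant


def check_segments(segments: list) -> list:
    """Return a list of problems found; an empty list means the recited
    coordinates are mutually consistent."""
    problems = []
    expected_start = 1
    for i, s in enumerate(segments, start=1):
        ref_len = s.ref_end - s.ref_start + 1
        var_len = s.var_end - s.var_start + 1
        if ref_len != var_len:
            problems.append(f"segment {i}: lengths differ ({ref_len} vs {var_len})")
        if s.var_start != expected_start:
            problems.append(f"segment {i}: expected to start at {expected_start}")
        expected_start = s.var_end + 1
    return problems


# Coordinates as recited for HUMPHOSLIP_PEA_2_P10.
P10 = [Segment(1, 67, 1, 67), Segment(163, 493, 68, 398)]
```

Here check_segments(P10) returns an empty list: both segments have matching lengths (67 and 331 residues) and the second begins at residue 68, immediately after the first. Amino acids 68 - 162 of PLTP_HUMAN have no counterpart in the variant.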

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of HUMPHOSLIP_PEA_2_P10,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in 

length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at 
least about 50 amino acids in length, wherein at least two amino acids comprise EK, having a 
structure as follows: a sequence starting from any of amino acid numbers 67-x to 67; and ending 
at any of amino acid numbers 68+ ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P12, comprising a first
amino acid sequence being at least 90% homologous to

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS
AEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMHAAFGGTFKKVYDFLSTFITSGMRF
LLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRG 
AFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDK
VPHDLDMLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASVTIALVP 
PDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHSALESLALIPLQAPLK 
TMLQIGVMPMLN corresponding to amino acids 1 - 427 of PLTP_HUMAN, which also 
corresponds to amino acids 1 - 427 of HUMPHOSLIP_PEA_2_P12, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
GKAGV corresponding to amino acids 428 - 432 of HUMPHOSLIP_PEA_2_P12, wherein said 
first amino acid sequence and second amino acid sequence are contiguous and in a sequential 
order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P12, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence GKAGV in HUMPHOSLIP_PEA_2_P12.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P31, comprising a first
amino acid sequence being at least 90% homologous to

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISE corresponding to amino acids 1 - 67 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 67 of HUMPHOSLIP_PEA_2_P31, and a second amino acid sequence being at
least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and 
most preferably at least 95% homologous to a polypeptide having the sequence 
PGLERGADKFPWGGSSLFLALDLTLRPPVG corresponding to amino acids 68 - 98 of 
HUMPHOSLIP_PEA_2_P31, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P31, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence PGLERGADKFPWGGSSLFLALDLTLRPPVG in HUMPHOSLIP_PEA_2_P31.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P33, comprising a first
amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS 
AEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMHAAFGGTFKKVYDFLSTFITSGMRF 
LLNQQ corresponding to amino acids 1 - 183 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 183 of HUMPHOSLIP_PEA_2_P33, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and 
most preferably at least 95% homologous to a polypeptide having the sequence 
VWAATGRRVARVGMLSL corresponding to amino acids 184 - 200 of 
HUMPHOSLIP_PEA_2_P33, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P33, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VWAATGRRVARVGMLSL in HUMPHOSLIP_PEA_2_P33.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P34, comprising a first
amino acid sequence being at least 90% homologous to

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS 
AEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMHAAFGGTFKKVYDFLSTFITSGMRF 
LLNQQICPVLYHAGTVLLNSLLDTVPV corresponding to amino acids 1 - 205 of 
PLTP_HUMAN, which also corresponds to amino acids 1 - 205 of
HUMPHOSLIP_PEA_2_P34, and a second amino acid sequence being at least 70%, optionally
at least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least 
95% homologous to a polypeptide having the sequence LWTSLLALTDPS corresponding to 
amino acids 206 - 217 of HUMPHOSLIP_PEA_2_P34, wherein said first amino acid sequence
and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P34, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence LWTSLLALTIPS in HUMPHOSLIP_PEA_2_P34.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P35, comprising a first
amino acid sequence being at least 90% homologous to

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWF corresponding to
amino acids 1 - 109 of PLTP_HUMAN, which also corresponds to amino acids 1 - 109 of
HUMPHOSLIP_PEA_2_P35, a second, bridging amino acid sequence
comprising L, a third amino acid sequence being at least 90% homologous to
KVYDFLSTFITSGMRFLLNQQ corresponding to amino acids 163 - 183 of PLTP_HUMAN,
which also corresponds to amino acids 111 - 131 of HUMPHOSLIP_PEA_2_P35, and a fourth
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more 
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having 
the sequence VWAATGRRVARVGMLSL corresponding to amino acids 132 - 148 of 
HUMPHOSLIP_PEA_2_P35, wherein said first amino acid sequence, second amino acid
sequence, third amino acid sequence and fourth amino acid sequence are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for an edge portion of HUMPHOSLIP_PEA_2_P35, comprising
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length, 

optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in
length, more preferably at least about 40 amino acids in length and most preferably at least 
about 50 amino acids in length, wherein at least two amino acids comprise FLK having a 
structure as follows (numbering according to HUMPHOSLIP_PEA_2_P35): a sequence starting
from any of amino acid numbers 109-x to 109; and ending at any of amino acid numbers 111 + 

((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P35, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence VWAATGRRVARVGMLSL in HUMPHOSLIP_PEA_2_P35.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P6, comprising a first amino acid 
sequence being at least 90 % homologous to 

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD 
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV 
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV 
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGG
LPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESIEKISKVEC

GFAT corresponding to amino acids 1 - 412 of CT31_HUMAN, which also corresponds to 
amino acids 1-412 of R38144_PEA_2_P6, and a second amino acid sequence being at least 
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

LASFSHMSDQRSARPQAGQPHGWLPGRDCEIPLPPV corresponding to amino acids 413 -
449 of R38144_PEA_2_P6, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R38144_PEA_2_P6, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence 
LASFSHMSDQRSARPQAGQPHGWLPGRDCEIPLPPV in R38144_PEA_2_P6. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P13, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD 
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQ corresponding to amino acids 1 - 323 of
CT31_HUMAN, which also corresponds to amino acids 1 - 323 of R38144_PEA_2_P13, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 
having the sequence NLLKAQCTSTVPRGIPPS corresponding to amino acids 324 - 341 of 

R38144_PEA_2_P13, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R38144_PEA_2_P13, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 

least about 90% and most preferably at least about 95% homologous to the sequence
NLLKAQCTSTVPRGIPPS in R38144_PEA_2_P13.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P15, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET 
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV 
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLE corresponding
to amino acids 1 - 282 of CT31_HUMAN, which also corresponds to amino acids 1 - 282 of
R38144_PEA_2_P15, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence PHWRH corresponding to amino acids 283 - 
287 of R38144_PEA_2_P15, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R38144_PEA_2_P15, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence PHWRH 
in R38144_PEA_2_P15.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P19, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD 

ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV 
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV 
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR 
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGG
LPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESIEKISKVEC
GFAT corresponding to amino acids 1 - 412 of CT31_HUMAN, which also corresponds to
amino acids 1 - 412 of R38144_PEA_2_P19, and a second amino acid sequence being at least 
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

KRSRSVAQAGVQWCDHDSPQP corresponding to amino acids 413 - 433 of
R38144_PEA_2_P19, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R38144_PEA_2_P19, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
KRSRSVAQAGVQWCDHDSPQP in R38144_PEA_2_P19.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P24, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD 
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIR corresponding to amino acids 1 - 121 of CT31_HUMAN, which also corresponds to amino
acids 1 - 121 of R38144_PEA_2_P24, and a second amino acid sequence being at least 90%
homologous to 

EYNKAIRNYTRFDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLN
YYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDA
VESIEKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDA
VITPYGECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPSQPFTSKLALLGQVFL
DSS corresponding to amino acids 282 - 578 of CT31_HUMAN, which also corresponds to
amino acids 122 - 418 of R38144_PEA_2_P24, wherein said first amino acid sequence and
second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of R38144_PEA_2_P24, comprising
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length,

optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in 
length, more preferably at least about 40 amino acids in length and most preferably at least 
about 50 amino acids in length, wherein at least two amino acids comprise RE, having a 
structure as follows: a sequence starting from any of amino acid numbers 121-x to 121; and 

ending at any of amino acid numbers 122+ ((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR corresponding to amino acids 1 - 36 
of AAH16184, which also corresponds to amino acids 1 - 36 of R38144_PEA_2_P36, and a

second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, 
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 
having the sequence FWGMSQNSKEWLKCSRTAWTLILM corresponding to amino acids 37 
- 60 of R38144_PEA_2_P36, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
FWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first amino acid
sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHY corresponding to amino acids 1 - 35 of 
AAQ88943, which also corresponds to amino acids 1 - 35 of R38144_PEA_2_P36, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, 
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 
having the sequence RFWGMSQNSKEWLKCSRTAWTLILM corresponding to amino acids 
36 - 60 of R38144_PEA_2_P36, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a polypeptide being 
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
RFWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first amino acid 
sequence being at least 90 % homologous to 

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR corresponding to amino acids 1 - 36 
of CT31_HUMAN, which also corresponds to amino acids 1 - 36 of R38144_PEA_2_P36, and 
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 
85%, more preferably at least 90% and most preferably at least 95% homologous to a 
polypeptide having the sequence FWGMSQNSKEWLKCSRTAWTLILM corresponding to 
amino acids 37 - 60 of R38144_PEA_2_P36, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
FWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for AA161187_P6, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR corresponding to amino
acids 1 - 42 of AA161187_P6, and a second amino acid sequence being at least 90 %
homologous to

GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYS
DLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPV
TYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNH
LFLKYSFRKDIFGDMVCAGNAQGGKDACFGDSGGPLACNKNGLWYQIGVVSWGVGC
GRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV
corresponding to amino acids 31 - 314 of TEST_HUMAN, which also corresponds to amino
acids 43 - 326 of AA161187_P6, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of AA161187_P6, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR of AA161187_P6.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for AA161187_P13, comprising a first amino acid
sequence being at least 90 % homologous to

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1-183 of AA161187_P13, and a second amino acid sequence being at least 
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

GSSGRHHKQLYVQPPLPQVQFPQGHLWRHG corresponding to amino acids 184 - 213 of
AA161187_P13, wherein said first amino acid sequence and second amino acid sequence are 
contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of AA161187_P13, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
GSSGRHHKQLYVQPPLPQVQFPQGHLWRHG in AA161187_P13.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for AA161187_P14, comprising a first amino acid
sequence being at least 90 % homologous to

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1 - 183 of AA161187_P14, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

GCCLSPSHYRPHSTAISPHPPGSSGRHHKQLYVQPPLPQVQFPQGHLWRHGLCWQCPRR 
EGCLLRECPCHHSQPRKASCVPVPYLTLMPTPGGGDCCPTLQMQKRRLGCCQGEEEDV 

HPVYPAP corresponding to amino acids 184 - 307 of AA161187_P14, wherein said first amino
acid sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of AA161187_P14, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence

GCCLSPSHYRPHSTAISPHPPGSSGRHHKQLYVQPPLPQVQFPQGHLWRHGLCWQCPRR
EGCLLRECPCHHSQPRKASCVPVPYLTLMPTPGGGDCCPTLQMQKRRLGCCQGEEEDV
HPVYPAP in AA161187_P14.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for AA161187_P18, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR corresponding to amino
acids 1 - 42 of AA161187_P18, a second amino acid sequence being at least 90 % homologous
to GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFET
corresponding to amino acids 31 - 86 of TEST_HUMAN, which also corresponds to amino
acids 43 - 98 of AA161187_P18, a third amino acid sequence being at least 90 % homologous to
DLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPV
TYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNH
LFLKYSFRKDIFGDMVCAGNAQGGKDACF corresponding to amino acids 89 - 235 of
TEST_HUMAN, which also corresponds to amino acids 99 - 245 of AA161187_P18, and a
fourth amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence VSVPATTPSPGKHPVSLCLI corresponding to amino acids 246 - 265 of
AA161187_P18, wherein said first amino acid sequence, second amino acid sequence, third
amino acid sequence and fourth amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of AA161187_P18, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR of AA161187_P18.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of AA161187_P18, comprising a 
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally 
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more 
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise TD, having a structure as follows: a
sequence starting from any of amino acid numbers 98-x to 99; and ending at any of amino acid
numbers 99+ ((n-2) - x), in which x varies from 0 to n-2. 
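
As a hedged illustration of the edge-portion definition above, the following sketch enumerates the candidate windows for a given length n, taking the start position as 98 - x and the end position as 99 + ((n-2) - x) for x from 0 to n-2. The function name `edge_windows` and its default arguments are hypothetical, and the sketch does not check the windows against the overall length of the polypeptide.

```python
def edge_windows(n, left=98, right=99):
    """Return 1-based inclusive (start, end) pairs, one for each x in 0..n-2.

    `left` and `right` are the two junction residues (98 and 99 for the
    edge portion of AA161187_P18 described above).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span both junction residues")
    return [(left - x, right + (n - 2) - x) for x in range(n - 1)]


windows = edge_windows(10)
for start, end in windows:
    # Every window has length n and contains both junction residues.
    assert end - start + 1 == 10
    assert start <= 98 and end >= 99
print(windows[0], windows[-1])
```

Under these assumptions every window has exactly n residues and always contains positions 98 and 99, which is why x is limited to the range 0 to n-2.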

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of AA161187_P18, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
VSVPATTPSPGKHPVSLCLI in AA161187_P18.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for AA161187_P19, comprising a first amino acid
sequence being at least 90 % homologous to 

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS 
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY 
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG 
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1-183 of AA161187_P19, and a second amino acid sequence being at least 
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence DKRTQ 
corresponding to amino acids 184 - 188 of AA161187_P19, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of AA161187_P19, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence DKRTQ in
AA161187_P19.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z25299_PEA_2_P2, comprising a first amino acid
sequence being at least 90 % homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLNPPNFCEMDGQCKRDLK
CCMGMCGKSCVSPVK corresponding to amino acids 1 - 131 of ALK1_HUMAN, which also
corresponds to amino acids 1 - 131 of Z25299_PEA_2_P2, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence 
GKQGMRAH corresponding to amino acids 132 - 139 of Z25299_PEA_2_P2, wherein said 
first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z25299_PEA_2_P2, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
GKQGMRAH in Z25299_PEA_2_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z25299_PEA_2_P3, comprising a first amino acid
sequence being at least 90 % homologous to 

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP 
GKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLNPPNFCEMDGQCKRDLK 
CCMGMCGKSCVSPVK corresponding to amino acids 1 - 131 of ALK1_HUMAN, which also
corresponds to amino acids 1 - 131 of Z25299_PEA_2_P3, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
GEKRHHKQLRDQEVDPLEMRRHSAG corresponding to amino acids 132 - 156 of
Z25299_PEA_2_P3, wherein said first and second amino acid sequences are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z25299_PEA_2_P3, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
GEKRHHKQLRDQEVDPLEMRRHSAG in Z25299_PEA_2_P3. 
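
The graded homology thresholds recited for the tail sequences (70%, 80%, 85%, 90%, 95%) can be illustrated with a deliberately simplified calculation. In practice homology is determined with a sequence alignment program; the sketch below assumes, purely for illustration, an ungapped position-by-position comparison of two sequences of equal length. The function names and the `variant` sequence are hypothetical and are not taken from the specification.

```python
THRESHOLDS = (70, 80, 85, 90, 95)

# Tail of Z25299_PEA_2_P3 as recited above (amino acids 132 - 156).
TAIL = "GEKRHHKQLRDQEVDPLEMRRHSAG"


def percent_identity(candidate, reference):
    """Ungapped percent identity between two equal-length sequences.

    This is a simplifying assumption, not an alignment algorithm.
    """
    if len(candidate) != len(reference):
        raise ValueError("this simplified sketch requires equal-length sequences")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def thresholds_met(candidate, reference=TAIL):
    """Return the recited thresholds that the candidate satisfies."""
    pid = percent_identity(candidate, reference)
    return [t for t in THRESHOLDS if pid >= t]


# Hypothetical candidate differing from the tail at its last residue only.
variant = "GEKRHHKQLRDQEVDPLEMRRHSAA"
print(percent_identity(variant, TAIL), thresholds_met(variant))
```

A real comparison would need to handle insertions, deletions and conservative substitutions, which this sketch ignores.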

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z25299_PEA_2_P7, comprising a first amino acid 
sequence being at least 90 % homologous to 

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNP corresponding to amino acids 1 - 81 of ALK1_HUMAN,
which also corresponds to amino acids 1 - 81 of Z25299_PEA_2_P7, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
RGSLGSAQ corresponding to amino acids 82 - 89 of Z25299_PEA_2_P7, wherein said first 
and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z25299_PEA_2_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
RGSLGSAQ in Z25299_PEA_2_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z25299_PEA_2_P10, comprising a first amino acid
sequence being at least 90 % homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNPT corresponding to amino acids 1 - 82 of ALK1_HUMAN,
which also corresponds to amino acids 1 - 82 of Z25299_PEA_2_P10. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R66178_P3, comprising a first amino acid sequence 
being at least 90 % homologous to 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANP
LPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLEL
EDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQAVLRAKKGQDDKVLVATCTS
ANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYHM
DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT corresponding to amino
acids 1 - 334 of PVR1_HUMAN, which also corresponds to amino acids 1 - 334 of R66178_P3, 
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 
85%, more preferably at least 90% and most preferably at least 95% homologous to a 
polypeptide having the sequence GEGHSLPISPGVLQTQNCGP corresponding to amino acids 
335 - 354 of R66178_P3, wherein said first amino acid sequence and second amino acid 
sequence are contiguous and in a sequential order. 



WO 2006/131783 PCT/IB2005/004037

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R66178_P3, comprising a polypeptide being at least 
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about 
90% and most preferably at least about 95% homologous to the sequence 
GEGHSLPISPGVLQTQNCGP in R66178_P3.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R66178_P4, comprising a first amino acid sequence 
being at least 90 % homologous to 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANP
LPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLEL
EDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQAVLRAKKGQDDKVLVATCTS 
ANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYHM 
DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP 
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT corresponding to amino 
acids 1 - 334 of PVR1_HUMAN, which also corresponds to amino acids 1 - 334 of R66178_P4,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 
85%, more preferably at least 90% and most preferably at least 95% homologous to a 
polypeptide having the sequence AFCQLIYPGKGRTRARMF corresponding to amino acids
335 - 352 of R66178_P4, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R66178_P4, comprising a polypeptide being at least 
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about 
90% and most preferably at least about 95% homologous to the sequence 
AFCQLIYPGKGRTRARMF in R66178_P4.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R66178_P8, comprising a first amino acid sequence 
being at least 90 % homologous to 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANP
LPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLEL
EDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQAVLRAKKGQDDKVLVATCTS
ANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYHM
DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVE corresponding to amino acids 1
- 330 of PVR1_HUMAN, which also corresponds to amino acids 1 - 330 of R66178_P8, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, 
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 
having the sequence NSPTPRLLPNMGGAPGRCPRPSLGAWRGASCWC corresponding to 
amino acids 331 - 363 of R66178_P8, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R66178_P8, comprising a polypeptide being at least 
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
NSPTPRLLPNMGGAPGRCPRPSLGAWRGASCWC in R66178_P8.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HSU33147_PEA_1_P5, comprising a first amino
acid sequence being at least 90 % homologous to 
MKLLMVLML 

DELKECFLNQTDETLSNVE corresponding to amino acids 1 - 78 of MGBA_HUMAN, which
also corresponds to amino acids 1 - 78 of HSU33147_PEA_1_P5, and a second amino acid
sequence being at least 90 % homologous to QLIYDSSLCDLF corresponding to amino acids 82
- 93 of MGBA_HUMAN, which also corresponds to amino acids 79 - 90 of
HSU33147_PEA_1_P5, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of HSU33147_PEA_1_P5, 
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in 
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino 
acids in length, more preferably at least about 40 amino acids in length and most preferably at 
least about 50 amino acids in length, wherein at least two amino acids comprise EQ, having a
structure as follows: a sequence starting from any of amino acid numbers 78-x to 78; and ending
at any of amino acid numbers 79+ ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HSU33147_PEA_1_P5, comprising a first amino 
acid sequence being at least 90 % homologous to 
MKLLMVLMLAALSQHCYAGSG 

DELKECFLNQTDETLSNVE corresponding to amino acids 1 - 78 of MGBA_HUMAN, which
also corresponds to amino acids 1 - 78 of HSU33147_PEA_1_P5, and a second amino acid
sequence being at least 90 % homologous to QLIYDSSLCDLF corresponding to amino acids 82
- 93 of MGBA_HUMAN, which also corresponds to amino acids 79 - 90 of
HSU33147_PEA_1_P5, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of HSU33147_PEA_1_P5,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in 
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino 
acids in length, more preferably at least about 40 amino acids in length and most preferably at 
least about 50 amino acids in length, wherein at least two amino acids comprise EQ, having a 
structure as follows: a sequence starting from any of amino acid numbers 78-x to 78; and ending 
at any of amino acid numbers 79+ ((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P3, comprising a first amino acid
sequence being at least 90 % homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKD corresponding to
amino acids 1 - 517 of APP1_HUMAN, which also corresponds to amino acids 1 - 517 of
M78076_PEA_1_P3, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence GE corresponding to amino acids 518 - 519
of M78076_PEA_1_P3, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P4, comprising a first amino acid
sequence being at least 90 % homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKG
corresponding to amino acids 1 - 526 of APP1_HUMAN, which also corresponds to amino
acids 1 - 526 of M78076_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
ECLTVNPSLQIPLNP corresponding to amino acids 527 - 541 of M78076_PEA_1_P4, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M78076_PEA_1_P4, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
ECLTVNPSLQIPLNP in M78076_PEA_1_P4.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for M78076_PEA_1_P12, comprising a first amino acid 
sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKG
corresponding to amino acids 1 - 526 of APP1_HUMAN, which also corresponds to amino
acids 1 - 526 of M78076_PEA_1_P12, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
ECVCSKGFPFPLIGDSEG corresponding to amino acids 527 - 544 of M78076_PEA_1_P12, 
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M78076_PEA_1_P12, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
ECVCSKGFPFPLIGDSEG in M78076_PEA_1_P12.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P14, comprising a first amino acid 
sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGST
EQDAASPEKEKMNPLEQYERKVNASVPRGFPFHSSEIQRDEL corresponding to amino
acids 1 - 570 of APP1_HUMAN, which also corresponds to amino acids 1 - 570 of
M78076_PEA_1_P14, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence 

VRGGTAGYLGEETRGQRPGCDSQSHTGPSKKPSAPSPLPAGTSWDRGVP corresponding 
to amino acids 571 - 619 of M78076_PEA_1_P14, wherein said first amino acid sequence and
second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M78076_PEA_1_P14, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
VRGGTAGYLGEETRGQRPGCDSQSHTGPSKKPSAPSPLPAGTSWDRGVP in
M78076_PEA_1_P14.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P21, comprising a first amino acid 
sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
E corresponding to amino acids 1 - 352 of APP1_HUMAN, which also corresponds to amino
acids 1 - 352 of M78076_PEA_1_P21, and a second amino acid sequence being at least 90 %
homologous to
AERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMT
LPKGSTEQDAASPEKEKMNPLEQYERKVNASVPRGFPFHSSEIQRDELAPAGTGVSREA
VSGLLIMGAGGGSLIVLSMLLLRRKKPYGAISHGVVEVDPMLTLEEQQLRELQRHGYE
NPTYRFLEERP corresponding to amino acids 406 - 650 of APP1_HUMAN, which also
corresponds to amino acids 353 - 597 of M78076_PEA_1_P21, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order. 
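
The relationship between the variant numbering and the APP1_HUMAN numbering recited above can be sketched as a small coordinate map. The following is illustrative only; the table `SEGMENTS` and the function names are hypothetical, and the residue ranges are those stated for M78076_PEA_1_P21.

```python
# (variant_start, variant_end, reference_start, reference_end), 1-based inclusive.
SEGMENTS = [
    (1, 352, 1, 352),      # first amino acid sequence
    (353, 597, 406, 650),  # second amino acid sequence
]


def variant_to_reference(pos):
    """Map a variant residue number to the corresponding reference residue number."""
    for v_start, v_end, r_start, _r_end in SEGMENTS:
        if v_start <= pos <= v_end:
            return r_start + (pos - v_start)
    raise ValueError(f"position {pos} is outside the variant")


def skipped_reference_range():
    """Reference residues lying between the two recited segments."""
    return (SEGMENTS[0][3] + 1, SEGMENTS[1][2] - 1)


# Each segment must cover the same number of residues in both numberings.
for v_start, v_end, r_start, r_end in SEGMENTS:
    assert v_end - v_start == r_end - r_start

print(variant_to_reference(352), variant_to_reference(353), skipped_reference_range())
```

On these figures, reference residues 353 to 405 have no counterpart in the variant, so variant residues 352 and 353 are adjacent only in the variant; this is the junction addressed by the edge portion of M78076_PEA_1_P21.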

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of M78076_PEA_1_P21,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise EA, having a
structure as follows: a sequence starting from any of amino acid numbers 352-x to 352; and
ending at any of amino acid numbers 353+ ((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P24, comprising a first amino acid 
sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQI corresponding to amino acids 1 - 481 of APP1_HUMAN, which also
corresponds to amino acids 1 - 481 of M78076_PEA_1_P24, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
RECLLPWLPLQISEGRS corresponding to amino acids 482 - 498 of M78076_PEA_1_P24,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M78076_PEA_1_P24, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence 
RECLLPWLPLQISEGRS in M78076_PEA_1_P24. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076JPEAJ JP2, comprising a first amino acid 

10 sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 

CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 

RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ 

EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG 

15 SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV 

DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN 

EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL 
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQV corresponding to amino acids 
1 - 449 of APP1_HUMAN, which also corresponds to amino acids 1 - 449 of 
M78076_PEA_1_P2, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence 

LTSFQLPNAPLFLRRPRLRLFSCPLDPLSVSWTPSYPLNTASLPLPSLSAQLPDPETWTLT 
CCVFDPCFLALGFLLPPPSILCSVPWIFTAFPRIVFFFFFFLRQVLALSPRQESSVRSWLIAT 
STSWVQAILLPQPLE corresponding to amino acids 450 - 588 of M78076_PEA_1_P2, 

wherein said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M78076_PEA_1_P2, comprising a polypeptide being 
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
LTSFQLPNAPLFLRRPRLRLFSCPLDPLSVSWTPSYPLNTASLPLPSLSAQLPDPETWTLT 
CCVFDPCFLALGFLLPPPSILCSVPWIFTAFPRIVFFFFFFLRQVLALSPRQESSVRSWLIAT 
STSWVQAILLPQPLE in M78076_PEA_1_P2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M78076_PEA_1_P25, comprising a first amino acid 
sequence being at least 90 % homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 

CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 

RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ 

EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG 
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV 
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN 
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL 
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQ corresponding to amino acids 1 

- 448 of APP1_HUMAN, which also corresponds to amino acids 1 - 448 of 

M78076_PEA_1_P25, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence 

PQNPNSQPRAAGSLEVIISHPFVRRLEILISPFQFQNSIPKNSQIVPAASPRGTSSP 
corresponding to amino acids 449 - 505 of M78076_PEA_1_P25, wherein said first amino acid 

sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a tail of M78076_PEA_1_P25, comprising a polypeptide 

being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 

PQNPNSQPRAAGSLEVIISHPFVRRLEILISPFQFQNSIPKNSQIVPAASPRGTSSP in 

M78076_PEA_1_P25. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M79217_PEA_1_P1, comprising a first amino acid 

sequence being at least 90 % homologous to 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEAD 
EAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACK 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGC 

RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA 

CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 

RAMVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 

RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 

ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPVVLGEQVQLPY 

QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 

LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 

RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 

QAALGGNWREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 

WPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 

RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRD 

MVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFHERHKCINFF 

VKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to amino acids 13 - 

931 of BAA25445, which also corresponds to amino acids 1 - 919 of M79217_PEA_1_P1. 
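
The correspondence between parent-protein numbering and variant numbering recited above (amino acids 13 - 931 of BAA25445 corresponding to amino acids 1 - 919 of M79217_PEA_1_P1) is a fixed offset between two ranges of equal length. The following is a minimal sketch of that bookkeeping; the helper names `segment_length` and `to_variant_position` are hypothetical and are not taken from the specification.

```python
def segment_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def to_variant_position(parent_pos: int, parent_start: int, parent_end: int,
                        variant_start: int) -> int:
    """Map a 1-based position in the parent protein onto the variant numbering.

    Assumes the parent range and the variant range align residue for residue,
    as the recited correspondence implies.
    """
    if not parent_start <= parent_pos <= parent_end:
        raise ValueError("position lies outside the corresponding parent range")
    return parent_pos - parent_start + variant_start


# Ranges recited in the text: BAA25445 13 - 931 <-> M79217_PEA_1_P1 1 - 919.
parent_len = segment_length(13, 931)
variant_len = segment_length(1, 919)
first = to_variant_position(13, 13, 931, 1)
last = to_variant_position(931, 13, 931, 1)
```

Both ranges span 919 residues, so position 13 of the parent maps to position 1 of the variant and position 931 maps to position 919. The same check can be applied to any other parent/variant range pair recited in this section.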

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M79217_PEA_1_P2, comprising a first amino acid 
sequence being at least 90 % homologous to 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEAD 

EAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACK 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGC 

RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA 

CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 

RAMVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 

RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 

ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPVVLGEQVQLPY 

QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 

LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 

RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 

QAALGGNWREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 
WPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 
RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK corresponding to amino 
acids 1 - 807 of EXL3_HUMAN, which also corresponds to amino acids 1 - 807 of 
M79217_PEA_1_P2, and a second amino acid sequence being at least 90 % homologous to 
AIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFHERHK 
CINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to amino acids 
820 - 919 of EXL3_HUMAN, which also corresponds to amino acids 808 - 907 of 
M79217_PEA_1_P2, wherein said first amino acid sequence and second amino acid sequence 
are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of M79217_PEA_1_P2, comprising 
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length, 
optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in 
length, more preferably at least about 40 amino acids in length and most preferably at least 
about 50 amino acids in length, wherein at least two amino acids comprise KA, having a 
structure as follows: a sequence starting from any of amino acid numbers 807-x to 807; and 
ending at any of amino acid numbers 808+ ((n-2) - x), in which x varies from 0 to n-2. 
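
The edge-portion rule above can be made concrete with a short sketch. For a chosen length n, a window starts at position 807 - x and ends at position 808 + ((n - 2) - x), with x running from 0 to n - 2, so every window has exactly n residues and spans the junction between amino acids 807 and 808 (the "KA" pair). The function name `edge_portion_windows` is hypothetical and is used for illustration only.

```python
from typing import List, Tuple


def edge_portion_windows(junction: int, n: int) -> List[Tuple[int, int]]:
    """Enumerate (start, end) pairs, 1-based and inclusive, of length n that
    contain both `junction` and `junction + 1`.

    Follows the recited rule: start = junction - x,
    end = (junction + 1) + ((n - 2) - x), for x = 0 .. n - 2.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    windows = []
    for x in range(0, n - 1):  # x takes the values 0 .. n-2
        start = junction - x
        end = (junction + 1) + ((n - 2) - x)
        windows.append((start, end))
    return windows


# Junction between amino acids 807 and 808 of M79217_PEA_1_P2, shortest recited length n = 10.
windows = edge_portion_windows(807, 10)
```

For n = 10 this yields nine windows, from (807, 816) for x = 0 down to (799, 808) for x = 8. Each is ten residues long and includes both position 807 and position 808.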

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M79217_PEA_1_P4, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
PELRQPARLGLPECWDYRHEPRCPAQMGSHFIVQAGLKLLASSKPPKCWDY 
corresponding to amino acids 1-51 of M79217_PEA_1_P4, and a second amino acid sequence 
being at least 90 % homologous to 

RVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSY 
VMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFH 

ERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to 
amino acids 759 - 919 of EXL3_HUMAN, which also corresponds to amino acids 52 - 212 of 
M79217_PEA_1_P4, wherein said first amino acid sequence and second amino acid sequence 
are contiguous and in a sequential order. 




According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of M79217_PEA_1_P4, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
PELRQPARLGLPECWDYRHEPRCPAQMGSHFIVQAGLKLLASSKPPKCWDY of 

M79217_PEA_1_P4. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M79217_PEA_1_P8, comprising a first amino acid 
sequence being at least 90 % homologous to 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEAD 

EAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACK 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGC 

RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA 

CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 

RAMVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 

RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 

ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPVVLGEQVQLPY 

QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 

LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 

RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 

QAALGGNWREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 

WPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 
RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK corresponding to amino 
acids 1 - 807 of EXL3_HUMAN, which also corresponds to amino acids 1 - 807 of 
M79217_PEA_1_P8, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence VRKSW corresponding to amino acids 808 - 
812 of M79217_PEA_1_P8, wherein said first amino acid sequence and second amino acid 
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M79217_PEA_1_P8, comprising a polypeptide being 
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence VRKSW 
in M79217_PEA_1_P8. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P4, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MATYIH corresponding to amino acids 1-6 of M62096_PEA_1_P4, and a second amino acid 
sequence being at least 90 % homologous to 
VSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNC 
RTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKT 
LKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEKE 

KYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQ 
IENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRE 
LSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYIS 
KMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQN 

MEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQ 

MESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREM 

KLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDD 

GGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALES 

ALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTA 
VHAIRGGGGSSSNSTHYQK corresponding to amino acids 239 - 957 of KF5C_HUMAN, 
which also corresponds to amino acids 7 - 725 of M62096_PEA_1_P4, wherein said first amino 
acid sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a head of M62096_PEA_1_P4, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MATYIH of M62096_PEA_1_P4. 

According to preferred embodiments of the present invention, there is provided an 

isolated chimeric polypeptide encoding for M62096_PEA_1_P5, comprising a first amino acid 
sequence being at least 90 % homologous to 

MTRILQDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWK 

KKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNI 

APVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRR 

DYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDEL 

AQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNG 

VIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHE 

AKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQ 

DAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDY 

NKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTT 

RVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRL 

RATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPI 

RPGHYPASSPTAVHAIRGGGGSSSNSTHYQK corresponding to amino acids 284 - 957 of 

KF5C_HUMAN, which also corresponds to amino acids 1 - 674 of M62096_PEA_1_P5. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P3, comprising a first amino acid 
sequence being at least 90 % homologous to 

MELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSL 

YRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKD 

EVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQELS 

NHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKS 

LVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQL 

EESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAH 

QKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLN 

DKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQK 

QKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKEN 

AMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGG 

SSSNSTHYQK corresponding to amino acids 365 - 957 of KF5C_HUMAN, which also 

corresponds to amino acids 1 - 593 of M62096_PEA_1_P3. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P7, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MTQNFRLMWNILLFPLNFS corresponding to amino acids 1-19 of M62096_PEA_1_P7, and 
a second amino acid sequence being at least 90 % homologous to 

LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL 
QTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVR 
DNADLRCELPKLEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRA 
KNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK corresponding to 
amino acids 738 - 957 of KF5C_HUMAN, which also corresponds to amino acids 20 - 239 of 
M62096_PEA_1_P7, wherein said first amino acid sequence and second amino acid sequence 
are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of M62096_PEA_1_P7, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MTQNFRLMWNILLFPLNFS of M62096_PEA_1_P7. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P8, comprising a first amino acid 
sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ 

EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD 

HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE 

EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE 

KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN 

CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNK 

TLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK 

EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRL 

QIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQR 

ELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYI 

SKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQN 
MEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQ 

MESHREAHQKQLSRLRDEIEEKQKIIDEIR corresponding to amino acids 1 - 736 of 
KF5C_HUMAN, which also corresponds to amino acids 1 - 736 of M62096_PEA_1_P8, and a 
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, 
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 

having the sequence E corresponding to amino acids 737 - 737 of M62096_PEA_1_P8, wherein 
said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P9, comprising a first amino acid 
sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ 
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD 
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE 
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE 
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN 
CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNK 

TLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK 

EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDE corresponding to amino acids 
1 - 454 of KF5C_HUMAN, which also corresponds to amino acids 1 - 454 of 

M62096_PEA_1_P9, and a second amino acid sequence being at least 70%, optionally at least 

80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 

homologous to a polypeptide having the sequence 

VKNAIYFFFHKVLLLLFVVDVCSRNLIGIEAFHNYRIMWKFLGRCPFTASYKLIITEFRK 
corresponding to amino acids 455 - 514 of M62096_PEA_1_P9, wherein said first amino acid 

sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a tail of M62096_PEA_1_P9, comprising a polypeptide being 

at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
VKNAIYFFFHKVLLLLFVVDVCSRNLIGIEAFHNYRIMWKFLGRCPFTASYKLIITEFRK 

in M62096_PEA_1_P9. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P10, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MTQNFRLMWNILLFPLNFS corresponding to amino acids 1-19 of M62096_PEA_1_P10, a 
second amino acid sequence being at least 90 % homologous to 

LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL 
QTLHNLRKLFVQDLTTRVKK corresponding to amino acids 738 - 815 of KF5C_HUMAN, 
which also corresponds to amino acids 20 - 97 of M62096_PEA_1_P10, and a third amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
VSSLCLNGTEKKIKDGREESFSVEISLA corresponding to amino acids 98 - 125 of 
M62096_PEA_1_P10, wherein said first amino acid sequence, second amino acid sequence and 
third amino acid sequence are contiguous and in a sequential order. 
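
The requirement that the recited sequences be "contiguous and in a sequential order" can be checked mechanically. The sketch below takes the three segments recited above for M62096_PEA_1_P10 with their stated variant coordinates, confirms that each range matches the length of its sequence and begins where the previous one ended, and joins them into one polypeptide. The function name `assemble_chimeric` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# (sequence, first residue number, last residue number) in variant numbering
Segment = Tuple[str, int, int]


def assemble_chimeric(segments: List[Segment]) -> str:
    """Join segments after checking that they tile the variant without gaps or overlap."""
    expected_start = 1
    for seq, start, end in segments:
        if start != expected_start:
            raise ValueError(f"segment starting at {start} is not contiguous")
        if end - start + 1 != len(seq):
            raise ValueError(f"range {start}-{end} does not match sequence length {len(seq)}")
        expected_start = end + 1
    return "".join(seq for seq, _, _ in segments)


# Segments of M62096_PEA_1_P10 as recited in the text.
p10_segments: List[Segment] = [
    ("MTQNFRLMWNILLFPLNFS", 1, 19),
    ("LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL"
     "QTLHNLRKLFVQDLTTRVKK", 20, 97),
    ("VSSLCLNGTEKKIKDGREESFSVEISLA", 98, 125),
]

p10 = assemble_chimeric(p10_segments)
```

The three segments have 19, 78 and 28 residues, which agrees with the recited ranges 1 - 19, 20 - 97 and 98 - 125 and gives a polypeptide of 125 amino acids.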

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of M62096_PEA_1_P10, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MTQNFRLMWNILLFPLNFS of M62096_PEA_1_P10. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M62096_PEA_1_P10, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
VSSLCLNGTEKKIKDGREESFSVEISLA in M62096_PEA_1_P10. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P11, comprising a first amino acid 
sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ 
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD 
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE 
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE 
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN 
CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNK 
TLKNVIQHLEMELNRWRN corresponding to amino acids 1 - 372 of KF5C_HUMAN, which 
also corresponds to amino acids 1 - 372 of M62096_PEA_1_P11, and a second amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
DFLAAHVFGKLLE corresponding to amino acids 373 - 385 of M62096_PEA_1_P11, wherein 
said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M62096_PEA_1_P11, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
DFLAAHVFGKLLE in M62096_PEA_1_P11. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for M62096_PEA_1_P12, comprising a first amino acid 
sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ 
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD 
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE 
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE 
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN 
CRTTIVICCSPSVFNEAETKSTLMFGQR corresponding to amino acids 1 - 323 of 
KF5C_HUMAN, which also corresponds to amino acids 1 - 323 of M62096_PEA_1_P12, and a 
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, 
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide 
having the sequence V corresponding to amino acids 324 - 324 of M62096_PEA_1_P12, 
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T99080_PEA_4_P5, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MPASARLAGAGLLLAFLRALGCAGRAPGLS corresponding to amino acids 1 - 30 of 
T99080_PEA_4_P5, and a second amino acid sequence being at least 90 % homologous to 
MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQGQLQGPIS 
KVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK corresponding to amino 
acids 1 - 99 of ACYO_HUMAN_V1, which also corresponds to amino acids 31 - 129 of 
T99080_PEA_4_P5, wherein said first amino acid sequence and second amino acid sequence 
are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of T99080_PEA_4_P5, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MPASARLAGAGLLLAFLRALGCAGRAPGLS of T99080_PEA_4_P5. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T99080_PEA_4_P8, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence M 
corresponding to amino acids 1 - 1 of T99080_PEA_4_P8, and a second amino acid sequence 
being at least 90 % homologous to 

QAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHIDKANFNNE 
KVILKLDYSDFQIVK corresponding to amino acids 28 - 99 of ACYO_HUMAN_V1, which 
also corresponds to amino acids 2 - 73 of T99080_PEA_4_P8, wherein said first amino acid 
sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first amino acid 
sequence being at least 90 % homologous to 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 

GPVLTWME corresponding to amino acids 1-185 of SNXQ_HUMAN, which also 

corresponds to amino acids 1-185 of T08446_PEA_1_P18, and a second amino acid sequence 

being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 

90% and most preferably at least 95% homologous to a polypeptide having the sequence 

LDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSW 

WRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 

GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVV 

DGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLY 

GKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNL 

AIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPA 

GRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEK 

QRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLS 

SQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGL 

GALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAP 

PAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTP 

ALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLR 

PGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEP 

LGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQ 

LRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQP 

SSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGM 

LGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPA 

SSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 

GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC corresponding to 

amino acids 186 - 1305 of T08446_PEA_1_P18, wherein said first amino acid sequence and 

second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of T08446_PEA_1_P18, comprising a polypeptide being 
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
LDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSW 
WRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 

GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVV 

DGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLY 

GKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNL 

AIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPA 

GRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEK 

QRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLS 

SQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGL 

GALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAP 

PAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTP 

ALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLR 

PGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEP 

LGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQ 

LRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQP 

SSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGM 

LGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPA 

SSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 

GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC in 

T08446_PEA_1_P18. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM 
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV 
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE 
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP 
LLTYQLYGKFSEAMSVPGEEERLVRV corresponding to amino acids 1 - 443 of 
T08446_PEA_1_P18, a second amino acid sequence being at least 90 % homologous to 

HDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVG 

MGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGSCPSTR 

LLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALGRG 

PSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHS 

SSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDG 

DELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRG 

PTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHLIPLLLRGA 

EAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALA 

LAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAW 

VPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSP 

CSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDR 

GENLYYEIGASEGSPYSG corresponding to amino acids 1 - 674 of Q9NT23, which also 

corresponds to amino acids 444 - 1117 of T08446_PEA_1_P18, a bridging amino acid P 

corresponding to amino acid 1118 of T08446_PEA_1_P18, and a third amino acid sequence 

being at least 90 % homologous to 

TRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPAR 
RPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHR 
VPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHS 
EGQTRSYC corresponding to amino acids 676 - 862 of Q9NT23, which also corresponds to 
amino acids 1119 - 1305 of T08446_PEA_1_P18, wherein said first amino acid sequence, 
second amino acid sequence, bridging amino acid and third amino acid sequence are contiguous 
and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP
LLTYQLYGKFSEAMSVPGEEERLVRV of T08446_PEA_1_P18.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM 
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV 
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE 
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP
LLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANT
SMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTF
TSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAER 
RKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRS 
AKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSS 
SESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDP 
APPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGA 
PASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLP 
PPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS 
LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRG 
corresponding to amino acids 1 - 1010 of T08446_PEA_1_P18, and a second amino acid 
sequence being at least 90 % homologous to 

LRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPG 
LYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPP 
DRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNL 
ALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPL
LLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC
corresponding to amino acids 1 - 295 of Q96CP3, which also corresponds to amino acids 1011 -
1305 of T08446_PEA_1_P18, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM 
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV 
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE 
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP 
LLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANT 
SMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTF 
TSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAER 
RKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRS 
AKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSS 
SESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDP 
APPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGA
PASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLP 
PPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS 
LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRG of
T08446_PEA_1_P18.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ corresponding to amino acids 1-154 
of T08446_PEA_1_P18, a second amino acid sequence being at least 90 % homologous to 
MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVI 
KRYTAQAPDELSFEVGDIVSVIDMPPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPG
LKADADGPPCGIPAPQGISSLTSAVPRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVF 
GCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPEL 
SGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLP 
PPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGAAAFR 

EVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQ
ARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKP 
LPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVG
PAPAGSCESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPR 
CLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAA 

LDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQ
QEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVA 
EQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPR 
QQSDGSLLRSQRPMGTSRRGLRGPA corresponding to amino acids 1 - 861 of BAC86902,
which also corresponds to amino acids 155 - 1015 of T08446_PEA_1_P18, a third amino acid 

sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
QVSAQLRAGGGGRDAPEAAAQSPCSVPS corresponding to amino acids 1016 - 1043 of 
T08446_PEA_1_P18, a fourth amino acid sequence being at least 90 % homologous to
QVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLY 

YEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPP
DHLGYS corresponding to amino acids 862 - 989 of BAC86902, which also corresponds to
amino acids 1044 - 1171 of T08446_PEA_1_P18, and a fifth amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

APQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAP
WGPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYP
TPSWSLHSEGQTRSYC corresponding to amino acids 1172 - 1305 of T08446_PEA_1_P18,
wherein said first amino acid sequence, second amino acid sequence, third amino acid sequence, 
fourth amino acid sequence and fifth amino acid sequence are contiguous and in a sequential 
order. 
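The requirement that the segments be "contiguous and in a sequential order" can be checked mechanically from the coordinate ranges recited above. The following is a minimal sketch only; the function names `is_contiguous` and `segment_lengths` are hypothetical helpers introduced here for illustration, and the ranges are the five 1-based, inclusive ranges stated in the preceding paragraph for T08446_PEA_1_P18.

```python
from typing import List, Tuple

# 1-based, inclusive coordinate ranges of the five segments on
# T08446_PEA_1_P18, copied from the paragraph above.
SEGMENTS: List[Tuple[int, int]] = [
    (1, 154),
    (155, 1015),
    (1016, 1043),
    (1044, 1171),
    (1172, 1305),
]


def is_contiguous(segments: List[Tuple[int, int]], total_length: int) -> bool:
    """Return True if the segments cover 1..total_length in order,
    with no gaps and no overlaps."""
    expected_start = 1
    for start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


def segment_lengths(segments: List[Tuple[int, int]]) -> List[int]:
    """Length of each inclusive range, in residues."""
    return [end - start + 1 for start, end in segments]


if __name__ == "__main__":
    print(is_contiguous(SEGMENTS, 1305))
    print(segment_lengths(SEGMENTS))
```

The per-segment lengths can be compared against the ranges quoted for the reference proteins, for example 1 - 861 and 862 - 989 of BAC86902 for the second and fourth segments.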

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 

PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ of T08446_PEA_1_P18.
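The graded thresholds recited throughout this section (at least 70%, 80%, 85%, 90% and 95% homologous) can be illustrated with a simple calculation. The sketch below rests on a loudly stated assumption: it treats homology as plain percent identity between two sequences of equal length compared position by position, with no gaps. This section does not state how homology is computed, and an alignment tool may be intended instead. The names `percent_identity`, `highest_tier` and `TIERS` are hypothetical.

```python
from typing import Optional

# Thresholds recited in the text, highest first.
TIERS = (95, 90, 85, 80, 70)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree.

    Assumption: ungapped, position-by-position comparison. This is a
    simplification and not necessarily the measure the text intends.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def highest_tier(pct: float) -> Optional[int]:
    """Highest recited threshold that pct meets, or None if below 70%."""
    for tier in TIERS:
        if pct >= tier:
            return tier
    return None


if __name__ == "__main__":
    reference = "MLSLSLCSHL"  # first ten residues of the head recited above
    candidate = "MLSLSLCSHA"  # hypothetical candidate with one substitution
    pct = percent_identity(reference, candidate)
    print(pct, highest_tier(pct))
```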

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for an edge portion of T08446_PEA_1_P18, comprising an amino
acid sequence being at least 70%, optionally at least about 80%, preferably at least about 85%, 

more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence encoding for QVSAQLRAGGGGRDAPEAAAQSPCSVPS, corresponding to
T08446_PEA_1_P18.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of T08446_PEA_1_P18, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence 
APQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAP
WGPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYP
TPSWSLHSEGQTRSYC in T08446_PEA_1_P18.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for T11628_PEA_1_P2, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE 

corresponding to amino acids 1-55 of T11628_PEA_1_P2, and a second amino acid sequence
being at least 90 % homologous to
MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQV
LQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG corresponding to amino 
acids 1 - 99 of Q8WVH6, which also corresponds to amino acids 56 - 154 of 
T11628_PEA_1_P2, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of T11628_PEA_1_P2, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 

MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE of
T11628_PEA_1_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T11628_PEA_1_P5, comprising a first amino acid
sequence being at least 90 % homologous to 

MKASEDLKKHGATVLTAm

LQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG corresponding to amino 
acids 56 - 154 of MYG_HUMAN_V1, which also corresponds to amino acids 1 - 99 of
T11628_PEA_1_P5. 

According to preferred embodiments of the present invention, there is provided an 

isolated chimeric polypeptide encoding for T11628_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to 
MGLSDGEWQLVLNVWGKVEADIPGHGQ 

ASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQV 
SKHPGDFGADAQGAMNK corresponding to amino acids 1 - 134 of MYG_HUMAN_V1,
which also corresponds to amino acids 1 - 134 of T11628_PEA_1_P7, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence G 
corresponding to amino acids 135 - 135 of T11628_PEA_1_P7, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for T11628_PEA_1_P10, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE
corresponding to amino acids 1 - 55 of T11628_PEA_1_P10, and a second amino acid sequence
being at least 90 % homologous to

MKASEDLKKHGATVLTALGGILKEXGHHEAEIKPLAQSHATKHKIPVKYLEFK 
LQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG corresponding to amino 
acids 1-99 of Q8WVH6, which also corresponds to amino acids 56 - 154 of 
T11628_PEA_1_P10, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of T11628_PEA_1_P10, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence 

MGLSDGEWQLVLNVWGKVEADIPGHGQ of
T11628_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P9, comprising a
first amino acid sequence being at least 90 % homologous to 

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG 
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP 
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEV corresponding to amino acids 1 - 

274 of ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 274 of
R35137_PEA_1_PEA_1_PEA_1_P9, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence 

RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP 
LHLQGLHGRVRAYEAGGGSRAMARPSSPDGP

corresponding to amino acids 275 - 385 of R35137_PEA_1_PEA_1_PEA_1_P9, wherein said
first amino acid sequence and second amino acid sequence are contiguous and in a sequential
order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence 

RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP 
LHLQGLHGRVRAYEAGGGSRAMARPSSPDGPPPPPHLTWPCAGAGSAAAMWRW in 
R35137_PEA_1_PEA_1_PEA_1_P9.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P8, comprising a
first amino acid sequence being at least 90 % homologous to 

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK 

KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG

GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG 
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP 
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVL
MEMGPPYAGQQELASFHSTSKGYMGEC corresponding to amino acids 1 - 320 of 

ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 320 of
R35137_PEA_1_PEA_1_PEA_1_P8, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
VRTRRVGARGPWPGPPRPMGHPLLRT corresponding to amino acids 321 - 346 of 

R35137_PEA_1_PEA_1_PEA_1_P8, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P8, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VRTRRVGARGPWPGPPRPMGHPLLRT in R35137_PEA_1_PEA_1_PEA_1_P8.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P11, comprising a
first amino acid sequence being at least 90 % homologous to 
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG 
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG 
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQAR 
corresponding to amino acids 1 - 229 of ALAT_HUMAN_V1, which also corresponds to amino
acids 1 - 229 of R35137_PEA_1_PEA_1_PEA_1_P11, and a second amino acid sequence being
at least 90 % homologous to SGFGQREGTYHFRMTILPPLEKERLLLEKLSRFHAKFTLEYS
corresponding to amino acids 455 - 496 of ALAT_HUMAN_V1, which also corresponds to
amino acids 230 - 271 of R35137_PEA_1_PEA_1_PEA_1_P11, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of
R35137_PEA_1_PEA_1_PEA_1_P11, comprising a polypeptide having a length "n", wherein n
is at least about 10 amino acids in length, optionally at least about 20 amino acids in length, 
preferably at least about 30 amino acids in length, more preferably at least about 40 amino acids 
in length and most preferably at least about 50 amino acids in length, wherein at least two amino
acids comprise RS, having a structure as follows: a sequence starting from any of amino acid
numbers 229-x to 229; and ending at any of amino acid numbers 230 + ((n-2) - x), in which x
varies from 0 to n-2.
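The edge-portion definition above describes a family of windows of length n, each spanning the junction between amino acids 229 and 230 (the residues R and S). As a hedged illustration, the sketch below enumerates those windows, assuming the upper bound of x is n - 2 and the end position is 230 + ((n - 2) - x). The function name `edge_windows` and its parameter `junction_left` are hypothetical.

```python
from typing import List, Tuple


def edge_windows(n: int, junction_left: int = 229) -> List[Tuple[int, int]]:
    """Enumerate 1-based, inclusive (start, end) windows of length n that
    contain both junction residues (junction_left and junction_left + 1).

    Assumption: x runs from 0 to n - 2, the start is junction_left - x and
    the end is (junction_left + 1) + ((n - 2) - x).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    windows = []
    for x in range(0, n - 1):  # x = 0, 1, ..., n - 2
        start = junction_left - x
        end = (junction_left + 1) + ((n - 2) - x)
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    for start, end in edge_windows(10):
        print(start, end, end - start + 1)
```

Under this reading, every window has exactly n residues and there are n - 1 distinct windows for a given n.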

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P2, comprising a
first amino acid sequence being at least 90 % homologous to 

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEV corresponding to amino acids 1 -
274 of ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 274 of
R35137_PEA_1_PEA_1_PEA_1_P2, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP 
LHLQGLHGRVRVPRRLCGGGEHGRCSAAADAEADECAAVPAGARTGPAGPGGQPAR 
AHRPLLCAVPG corresponding to amino acids 275 - 399 of 

R35137_PEA_1_PEA_1_PEA_1_P2, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P2, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence 

RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP 
LHLQGLHGRVRVPRRLCGGGEHGRCSAAADAEADECAAVPAGARTGPAGPGGQPAR 
AHRPLLCAVPG in R35137_PEA_1_PEA_1_PEA_1_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P4, comprising a
first amino acid sequence being at least 90 % homologous to 

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVL
MEMGPPYAGQQELASFHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRL
CPPWGQALLDLWSPPAPTDPSFAQFQAEKQAVLAELAAKAKLTEQVFNEAPGISCNP
VQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGICWPGSGFGQREGTYH
FRMTILPPLEKLRLLLEKLSRFHAKFTLE corresponding to amino acids 1 - 494 of
ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 494 of
R35137_PEA_1_PEA_1_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence 

SPGRLWSPLYLLLMPGGVGWGGCWAPASLQVPNKAVWQSDSKKEALAAAWPAPTCL 
PFLQA corresponding to amino acids 495 - 555 of R35137_PEA_1_PEA_1_PEA_1_P4,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence 

SPGRLWSPLYLLLMPGGVGWGGCWAPASLQVPNKAVWQSDSKKEALAAAWPAPTCL 
PFLQA in R35137_PEA_1_PEA_1_PEA_1_P4.

According to preferred embodiments of the present invention, there is provided an 

isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAGSPCRGLAPGREEQRALHKAGAVGGGVR 

corresponding to amino acids 1 - 110 of R11723_PEA_1_P6, and a second amino acid sequence
being at least 90 % homologous to 

MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLREGEEDHV
RPEVGPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRERQRKEKHSMRTQ 
corresponding to amino acids 1 - 112 of Q8IXM0, which also corresponds to amino acids 111 -
222 of R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous
and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of R11723_PEA_1_P6, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 

at least about 90% and most preferably at least about 95% homologous to the sequence
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAGSPCRGLAPGREEQRALHKAGAVGGGVR of
R11723_PEA_1_P6.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first amino acid
sequence being at least 90 % homologous to

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV 
MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 1 - 83 of Q96AC2, 
which also corresponds to amino acids 1 - 83 of R11723_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 

least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL 
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ 
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84 - 222 of 
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 

SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ 
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first amino acid
sequence being at least 90 % homologous to

MWVLGIAATFCGLFLLPGFALQIQCYQC^

MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 1 - 83 of Q8N2G4, 
which also corresponds to amino acids 1-83 of R11723_PEA_1_P6, and a second amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84 - 222 of
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL 

RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first amino acid 
sequence being at least 90 % homologous to 

MWVLGIAATFCGLFLLPGFALQIQCYQ

MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 24 - 106 of BAC85518,
which also corresponds to amino acids 1-83 of R11723_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 

SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ 
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84 - 222 of 
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to 

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV 
MEQSAG corresponding to amino acids 1 - 64 of Q96AC2, which also corresponds to amino
acids 1-64 of R11723_PEA_1_P7, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65 - 93 of 
R11723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 

least about 90% and most preferably at least about 95% homologous to the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to 

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ

MEQSAG corresponding to amino acids 1-64 of Q8N2G4, which also corresponds to amino 
acids 1-64 of R11723_PEA_1_P7, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 

SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65 - 93 of
R11723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MWVLG corresponding to amino acids 1-5 of R11723_PEA_1_P7, a second amino acid
sequence being at least 90 % homologous to 

IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSAG 
corresponding to amino acids 22 - 80 of BAC85273, which also corresponds to amino acids 6 -
64 of R11723_PEA_1_P7, and a third amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence 

SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65 - 93 of 
R11723_PEA_1_P7, wherein said first, second and third amino acid sequences are contiguous

and in a sequential order. 
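The three-part construction recited above can be illustrated by joining the recited segments and reading the stated coordinates back out of the result. This is a sketch for illustration only: the helper names `assemble` and `slice_1based` are hypothetical, and the three strings are copied from the paragraph above (amino acids 1 - 5, 6 - 64 and 65 - 93 of R11723_PEA_1_P7).

```python
# Segments as recited in the paragraph above.
HEAD = "MWVLG"
MIDDLE = "IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSAG"
TAIL = "SHCVTRLECSGTISAHCNLCLPGSNDHPT"


def assemble(*segments: str) -> str:
    """Join segments so that they are contiguous and in sequential order."""
    return "".join(segments)


def slice_1based(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates,
    the numbering convention used in the text."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("coordinates out of range")
    return sequence[start - 1:end]


if __name__ == "__main__":
    full = assemble(HEAD, MIDDLE, TAIL)
    print(len(full))
    print(slice_1based(full, 65, 93))
```

Reading positions 65 - 93 back out of the assembled sequence returns the tail, which is the same sequence recited in the separate tail embodiments for this variant.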

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a head of R11723_PEA_1_P7, comprising a polypeptide

being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence

MWVLG of R11723_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 

isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a polypeptide being

at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence

SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAG corresponding to amino acids 24 - 87 of BAC85518, which also corresponds to
amino acids 1 - 64 of R11723_PEA_1_P7, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65 - 93 of 
Rl 1723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in 
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P13, comprising a first amino acid
sequence being at least 90 % homologous to 
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQL 

MEQSA corresponding to amino acids 1 - 63 of Q96AC2, which also corresponds to amino 
acids 1 - 63 of R11723_PEA_1_P13, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most 
preferably at least 95% homologous to a polypeptide having the sequence 
DTKRTNTLLFEMRHFAKQLTT corresponding to amino acids 64 - 84 of 
R11723_PEA_1_P13, wherein said first and second amino acid sequences are contiguous and in 
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P13, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
DTKRTNTLLFEMRHFAKQLTT in R11723_PEA_1_P13. 
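
The tiered thresholds used throughout these embodiments (at least 70%, 80%, 85%, 90% and 95% homologous) can be illustrated with a minimal sketch. It assumes, purely for illustration, that homology is measured as ungapped position-by-position identity between two sequences of equal length; this passage does not state the scoring method, and the function names and the single-substitution variant below are hypothetical.

```python
# Thresholds named in the text, from broadest to most preferred.
TIERS = (70, 80, 85, 90, 95)


def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of positions at which two equal-length sequences match.

    Assumption: ungapped identity stands in for "homologous"; the passage
    itself does not define how homology is scored.
    """
    if not reference:
        raise ValueError("reference sequence is empty")
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def highest_tier(percent: float):
    """Return the highest threshold that is met, or None if below 70%."""
    met = [tier for tier in TIERS if percent >= tier]
    return max(met) if met else None


# Tail sequence quoted in the text for R11723_PEA_1_P13 (amino acids 64 - 84).
TAIL_P13 = "DTKRTNTLLFEMRHFAKQLTT"

# Hypothetical variant with one substitution at the first position.
variant = "A" + TAIL_P13[1:]

tier = highest_tier(percent_identity(variant, TAIL_P13))
```

With one mismatch in 21 residues the hypothetical variant still meets the 95% tier; a second mismatch would drop it to the 90% tier under this simplified measure.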

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first amino acid
sequence being at least 90 % homologous to

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 1 - 63 of Q96AC2, which also corresponds to amino
acids 1 - 63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64 - 90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first amino acid
sequence being at least 90 % homologous to

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 1 - 63 of Q8N2G4, which also corresponds to amino
acids 1 - 63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64 - 90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MWVLG corresponding to amino acids 1 - 5 of R11723_PEA_1_P10, a second amino acid
sequence being at least 90 % homologous to

IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSA
corresponding to amino acids 22 - 79 of BAC85273, which also corresponds to amino acids 6 -
63 of R11723_PEA_1_P10, and a third amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64 - 90 of
R11723_PEA_1_P10, wherein said first, second and third amino acid sequences are contiguous
and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of R11723_PEA_1_P10, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
MWVLG of R11723_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first amino acid
sequence being at least 90 % homologous to

MWLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 24 - 86 of BAC85518, which also corresponds to amino
acids 1 - 63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64 - 90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for R16276_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to

MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPG corresponding to amino
acids 1 - 41 of NOV_HUMAN, which also corresponds to amino acids 1 - 41 of
R16276_PEA_1_P7, a bridging amino acid Q corresponding to amino acid 42 of
R16276_PEA_1_P7, a second amino acid sequence being at least 90 % homologous to
CPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTGI
CT corresponding to amino acids 43 - 103 of NOV_HUMAN, which also corresponds to amino
acids 43 - 103 of R16276_PEA_1_P7, and a third amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence GNPAPSAV
corresponding to amino acids 104 - 111 of R16276_PEA_1_P7, wherein said first amino acid
sequence, bridging amino acid, second amino acid sequence and third amino acid sequence are
contiguous and in a sequential order.
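
The coordinate bookkeeping in the preceding embodiment (first sequence at amino acids 1 - 41, bridging amino acid Q at 42, second sequence at 43 - 103, tail at 104 - 111) can be checked with a short sketch. The segment strings are copied from the text; the helper name `residues` and the variable names are hypothetical, and the sketch only verifies that the quoted segments, joined contiguously and in sequential order, occupy the stated 1-based positions.

```python
# Segments as quoted in the text for R16276_PEA_1_P7.
FIRST = "MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPG"  # amino acids 1 - 41
BRIDGE = "Q"                                           # amino acid 42
SECOND = (
    "CPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTGI"
    "CT"
)                                                      # amino acids 43 - 103
TAIL = "GNPAPSAV"                                      # amino acids 104 - 111


def residues(sequence: str, start: int, end: int) -> str:
    """Return the residues at 1-based, inclusive positions start..end."""
    return sequence[start - 1:end]


# "Contiguous and in a sequential order": plain concatenation.
assembled = FIRST + BRIDGE + SECOND + TAIL

bridge_ok = residues(assembled, 42, 42) == BRIDGE
tail_ok = residues(assembled, 104, 111) == TAIL
```

Under these assumptions the assembled sequence is 111 residues long, which agrees with the tail ending at amino acid 111.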

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R16276_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
GNPAPSAV in R16276_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for R16276_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to

MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPG corresponding to amino
acids 1 - 41 of NOV_HUMAN, which also corresponds to amino acids 1 - 41 of
R16276_PEA_1_P7, a bridging amino acid Q corresponding to amino acid 42 of
R16276_PEA_1_P7, a second amino acid sequence being at least 90 % homologous to
CPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTGI
CT corresponding to amino acids 43 - 103 of NOV_HUMAN, which also corresponds to amino
acids 43 - 103 of R16276_PEA_1_P7, and a third amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence GNPAPSAV
corresponding to amino acids 104 - 111 of R16276_PEA_1_P7, wherein said first amino acid
sequence, bridging amino acid, second amino acid sequence and third amino acid sequence are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R16276_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
GNPAPSAV in R16276_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P4, comprising a first amino
acid sequence being at least 90 % homologous to

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLL^ 

HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT 

LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV 

NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILNVL

corresponding to amino acids 1 - 234 of CEA5_HUMAN, which also corresponds to amino
acids 1 - 234 of HUMCEA_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence

CEYICSSLAQAASPNPQGQRQDFSVPLRFKYTDPQPWTSRLSVTFCPRKTWADQVLTKN
RRGGAASVLGGSGSTPYDGRNR corresponding to amino acids 235 - 315 of
HUMCEA_PEA_1_P4, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMCEA_PEA_1_P4, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
CEYICSSLAQAASPNPQGQRQDFSVPLRFKYTDPQPWTSRLSVTFCPRKTWADQVLTKN
RRGGAASVLGGSGSTPYDGRNR in HUMCEA_PEA_1_P4.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P5, comprising a first amino
acid sequence being at least 90 % homologous to 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQ 
HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT 
LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV 

NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDA
PTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTC 
QAHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQNTTYLWWV 
NNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELSVDHSDPVILNVLYGPDD
PTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQELFISNITEKNSGLYTCQ 

ANNSASGHSRTTVKTITVSAELPKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVN
GQSLPVSPRLQLSNGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTP 
IISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFV 
SNLATGRNNSIVKSITVS corresponding to amino acids 1 - 675 of CEA5_HUMAN, which 
also corresponds to amino acids 1 - 675 of HUMCEA_PEA_1_P5, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
GKWLPGASASYSGVESIWFSPKSQEDIFFPSLCSMGTRKSQILS corresponding to amino
acids 676 - 719 of HUMCEA_PEA_1_P5, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of HUMCEA_PEA_1_P5, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
GKWLPGASASYSGVESIWFSPKSQEDIFFPSLCSMGTRKSQILS in HUMCEA_PEA_1_P5.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P19, comprising a first amino
acid sequence being at least 90 % homologous to

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQ 
HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT 
LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV 
NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILN

corresponding to amino acids 1 - 232 of CEA5_HUMAN, which also corresponds to amino
acids 1 - 232 of HUMCEA_PEA_1_P19, and a second amino acid sequence being at least 90 %
homologous to

VLYGPDTPIISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN
GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVALI
corresponding to amino acids 589 - 702 of CEA5_HUMAN, which also corresponds to amino
acids 233 - 346 of HUMCEA_PEA_1_P19, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order.
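
Where a chimeric polypeptide joins two non-adjacent stretches of a known protein, as in the preceding embodiment (amino acids 1 - 232 and 589 - 702 of CEA5_HUMAN mapping to 1 - 232 and 233 - 346 of the variant), the stated coordinate ranges can be cross-checked mechanically. The sketch below is illustrative only: the `Segment` class and `is_consistent` function are hypothetical names, and the check assumes the variant numbering starts at amino acid 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned stretch, in 1-based inclusive coordinates."""
    known_start: int
    known_end: int
    variant_start: int
    variant_end: int


def is_consistent(segments) -> bool:
    """True if every segment has the same length in both coordinate
    systems and the variant coordinates are contiguous starting at 1."""
    expected_start = 1
    for seg in segments:
        known_len = seg.known_end - seg.known_start + 1
        variant_len = seg.variant_end - seg.variant_start + 1
        if known_len != variant_len or seg.variant_start != expected_start:
            return False
        expected_start = seg.variant_end + 1
    return True


# Coordinates as stated in the text for HUMCEA_PEA_1_P19.
P19 = [Segment(1, 232, 1, 232), Segment(589, 702, 233, 346)]

consistent = is_consistent(P19)

# Residues of the known protein that fall between the two segments.
skipped = P19[1].known_start - P19[0].known_end - 1
```

For these coordinates the two segments are 232 and 114 residues long, giving the 346-residue variant, and 356 residues of the known protein lie between them.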

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of HUMCEA_PEA_1_P19,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise NV, having a
structure as follows: a sequence starting from any of amino acid numbers 232-x to 232; and
ending at any of amino acid numbers 233+ ((n-2) - x), in which x varies from 0 to n-2.
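
The edge-portion definition above describes every window of length n that contains both residues flanking the junction (here amino acids 232 and 233). A minimal sketch of that enumeration follows; the function name `edge_windows` is hypothetical, and the sketch does not clip windows that would run past either end of the polypeptide.

```python
def edge_windows(junction_left: int, n: int):
    """Yield (start, end) in 1-based inclusive coordinates for each
    length-n window spanning the junction between residue junction_left
    and residue junction_left + 1.

    Follows the formula in the text: start = junction_left - x and
    end = (junction_left + 1) + ((n - 2) - x), for x from 0 to n - 2.
    """
    if n < 2:
        raise ValueError("a window must hold both junction residues")
    for x in range(0, n - 1):  # x = 0 .. n-2 inclusive
        start = junction_left - x
        end = (junction_left + 1) + ((n - 2) - x)
        yield start, end


# Junction of HUMCEA_PEA_1_P19 as stated in the text, with n = 10.
windows = list(edge_windows(232, 10))
```

For n = 10 this gives nine windows, from amino acids 232 - 241 (x = 0) down to 224 - 233 (x = 8), each exactly ten residues long and each containing the NV pair at positions 232 and 233.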

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P20, comprising a first amino
acid sequence being at least 90 % homologous to 

MESPSAPPHRWCIPWQRLLLTASLL^

HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT 
LHVIKSDLVNEEATGQFRVYP corresponding to amino acids 1 - 142 of CEA5_HUMAN,
which also corresponds to amino acids 1 - 142 of HUMCEA_PEA_1_P20, and a second amino
acid sequence being at least 90 % homologous to

ELPKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLSNGNRTLT
LFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISPPDSSYLSGANLNLSCHS
ASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGRNNSIVKSITVSASG 
TSPGLSAGATVGIMIGVLVGVALI corresponding to amino acids 499 - 702 of 
CEA5_HUMAN, which also corresponds to amino acids 143 - 346 of HUMCEA_PEA_1_P20, 
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of HUMCEA_PEA_1_P20,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise PE, having a 
structure as follows: a sequence starting from any of amino acid numbers 142-x to 142; and 
ending at any of amino acid numbers 143+ ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z44808_PEA_1_P5, comprising a first amino acid
sequence being at least 90 % homologous to

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD
GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA
APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKN
DNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA
KARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEE
RVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKK

ELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ corresponding to amino acids 1 - 441 
of SM02_HUMAN, which also corresponds to amino acids 1 - 441 of Z44808_PEA_1_P5, and
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence DAMVVSSRPKATTHRKSRTLSRR corresponding to amino
acids 442 - 464 of Z44808_PEA_1_P5, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of Z44808_PEA_1_P5, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
DAMVVSSRPKATTHRKSRTLSRR in Z44808_PEA_1_P5.

According to preferred embodiments of the present invention, there is provided an

isolated chimeric polypeptide encoding for Z44808_PEA_1_P6, comprising a first amino acid 
sequence being at least 90 % homologous to 

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD 

GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA
APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKN
DNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA 
KARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEE 
RVVHWYFKLLDKNSSGDIGKKEI^^

ELMGCLGVAKEDGKADTKKRH corresponding to amino acids 1 - 428 of SM02_HUMAN,
which also corresponds to amino acids 1 - 428 of Z44808_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
RSKRNL corresponding to amino acids 429 - 434 of Z44808_PEA_1_P6, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z44808_PEA_1_P6, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence RSKRNL
in Z44808_PEA_1_P6.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z44808_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to 

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 

TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD 

GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA 

APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKN 

DNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA 

KARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEE 
RVVHWYFKLLDKNSSGDIGKKEIKPF 

ELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ corresponding to amino acids 1 - 441
of SM02_HUMAN, which also corresponds to amino acids 1 - 441 of Z44808_PEA_1_P7, and
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 
85%, more preferably at least 90% and most preferably at least 95% homologous to a 
polypeptide having the sequence LLWLRGKVSFYCF corresponding to amino acids 442 - 454 
of Z44808_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and 
in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z44808_PEA_1_P7, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
LLWLRGKVSFYCF in Z44808_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z44808_PEA_1_P11, comprising a first amino acid
sequence being at least 90 % homologous to 

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD 
GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKT 
corresponding to amino acids 1 - 170 of SM02_HUMAN, which also corresponds to amino 
acids 1 - 170 of Z44808_PEA_1_P11, and a second amino acid sequence being at least 90 %
homologous to

DIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGL
YKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQ 
GCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLD 
KNSSGDIGKKEIKPFKRFLRKKSKP^ 
DGKADTKKRHTPRGHAESTSNRQPRKQG corresponding to amino acids 188 - 446 of
SM02_HUMAN, which also corresponds to amino acids 171 - 429 of Z44808_PEA_1_P11,
wherein said first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of Z44808_PEA_1_P11, comprising
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length,
optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in
length, more preferably at least about 40 amino acids in length and most preferably at least
about 50 amino acids in length, wherein at least two amino acids comprise TD, having a
structure as follows: a sequence starting from any of amino acid numbers 170-x to 170; and
ending at any of amino acid numbers 171+ ((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for H61775_P16, comprising a first amino acid
sequence being at least 90 % homologous to 

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL 
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 11 - 93 of Q9P2J2, which
also corresponds to amino acids 1 - 83 of H61775_P16, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and 
most preferably at least 95% homologous to a polypeptide having the sequence 
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW 
RSSCSVTLQV corresponding to amino acids 84 - 152 of H61775_P16, wherein said first and
second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of H61775_P16, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV in H61775_P16. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for H61775_P16, comprising a first amino acid
sequence being at least 90 % homologous to

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL 
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 1 - 83 of AAQ88495, which 
also corresponds to amino acids 1 - 83 of H61775_P16, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and
most preferably at least 95% homologous to a polypeptide having the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV corresponding to amino acids 84 - 152 of H61775_P16, wherein said first and
second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of H61775_P16, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV in H61775_P16.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for H61775_P17, comprising a first amino acid
sequence being at least 90 % homologous to

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 11 - 93 of Q9P2J2, which
also corresponds to amino acids 1 - 83 of H61775_P17.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for H61775_P17, comprising a first amino acid
sequence being at least 90 % homologous to 

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL 
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 1 - 83 of AAQ88495, which
also corresponds to amino acids 1 - 83 of H61775_P17.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for M85491_PEA_1_P13, comprising a first amino acid 
sequence being at least 90 % homologous to 

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIR 
TYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSIPSVPGSCKETFNLYYY
EADFDSATKTFPNWMENPWVKVDTIAADESFSQVDLGGRVMKINTEVRSFGPVSRSGF 
YLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAEEVD 
VPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPIN 
SRTTSEGATNCVCRNGYYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDSG 

GREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQLGLTEPRIYISDLLAHTQYTFEIQAV
NGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVDSITLSWSQPDQPNGVILDYEL 
QYYEK corresponding to amino acids 1 - 476 of EPB2_HUMAN, which also corresponds to
amino acids 1 - 476 of M85491_PEA_1_P13, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VPIGWVLSPSPTSLRAPLPG corresponding to amino acids 477 - 496 of
M85491_PEA_1_P13, wherein said first and second amino acid sequences are contiguous and
in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of M85491_PEA_1_P13, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
VPIGWVLSPSPTSLRAPLPG in M85491_PEA_1_P13.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for M85491_PEA_1_P14, comprising a first amino acid
sequence being at least 90 % homologous to
MALRRLGAALLLLPLLAAVEETLMDSTTATAELGm 
TYQVCNWESSQNNWLRTKFI 
EADFDSATKTFPNWMENPWVKV 

YLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAEEVD
VPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCR corresponding to amino acids 1 - 
270 of EPB2_HUMAN, which also corresponds to amino acids 1 - 270 of
M85491_PEA_1_P14, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence
ERQDLTMLSRLVLNSWPQMILPPQPPKVLEL corresponding to amino acids 271 - 301 of
M85491_PEA_1_P14, wherein said first and second amino acid sequences are contiguous and
in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of M85491_PEA_1_P14, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
ERQDLTMLSRLVLNSWPQMILPPQPPKVLEL in M85491_PEA_1_P14.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T39971_P6, comprising a first amino acid sequence 
being at least 90 % homologous to

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKG corresponding to amino
acids 1 - 276 of VTNC_HUMAN, which also corresponds to amino acids 1 - 276 of
T39971_P6, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence TQGVVGD corresponding to amino acids
277 - 283 of T39971_P6, wherein said first and second amino acid sequences are contiguous
and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of T39971_P6, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence TQGVVGD in
T39971_P6.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for T39971_P9, comprising a first amino acid sequence
being at least 90 % homologous to

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC 
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR 
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV 
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE 
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRT corresponding to amino acids 1 - 325 of 
VTNC_HUMAN, which also corresponds to amino acids 1 - 325 of T39971_P9, and a second
amino acid sequence being at least 90 % homologous to 

SGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRATWLSLFSSEESNLGA 
NNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLRTRRVDTVDPPYPRSIAQYWLGC
PAPGHL corresponding to amino acids 357 - 478 of VTNC_HUMAN, which also corresponds
to amino acids 326 - 447 of T39971_P9, wherein said first and second amino acid sequences are
contiguous and in a sequential order. 
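
The coordinate bookkeeping in the paragraph above (residues 1 - 325 and 357 - 478 of the parent protein becoming residues 1 - 325 and 326 - 447 of the variant) can be checked with a short sketch. The helper names `variant_coordinates` and `assemble` are hypothetical, and the sketch assumes 1-based, inclusive numbering as used in the text.

```python
from typing import List, Tuple

# A segment is a 1-based, inclusive (start, end) range in the parent protein.
Segment = Tuple[int, int]


def variant_coordinates(segments: List[Segment]) -> List[Segment]:
    """Map parent-protein segments, joined contiguously and in order,
    to their 1-based inclusive positions in the chimeric variant."""
    coords: List[Segment] = []
    next_start = 1
    for start, end in segments:
        if start < 1 or end < start:
            raise ValueError(f"invalid segment {(start, end)}")
        length = end - start + 1
        coords.append((next_start, next_start + length - 1))
        next_start += length
    return coords


def assemble(parent: str, segments: List[Segment]) -> str:
    """Concatenate the listed parent segments into one contiguous sequence."""
    return "".join(parent[start - 1:end] for start, end in segments)


if __name__ == "__main__":
    # Segments recited for the variant in the paragraph above.
    print(variant_coordinates([(1, 325), (357, 478)]))
    # Toy parent string, used only to show how assemble() behaves.
    print(assemble("ABCDEFGHIJ", [(1, 3), (7, 10)]))
```

Both segments keep their lengths (325 and 122 residues), so the second segment lands at positions 326 - 447 of the variant, in agreement with the numbering given in the text.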

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of T39971_P9, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise TS, having a structure as follows: a 
sequence starting from any of amino acid numbers 325-x to 325; and ending at any of amino 
acid numbers 326 + ((n-2) - x), in which x varies from 0 to n-2. 
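
The edge-portion formula above can be enumerated directly. The sketch below reads the text as "start at residue 325 - x and end at residue 326 + ((n-2) - x), for x from 0 to n-2", which is an interpretation rather than a statement from the specification; the function name `edge_windows` and the optional `variant_length` bound are hypothetical.

```python
from typing import List, Optional, Tuple


def edge_windows(junction_left: int, n: int,
                 variant_length: Optional[int] = None) -> List[Tuple[int, int]]:
    """List every 1-based inclusive window of length n that contains both
    residues flanking the junction (junction_left and junction_left + 1)."""
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    windows = []
    for x in range(0, n - 1):          # x varies from 0 to n-2
        start = junction_left - x
        end = (junction_left + 1) + ((n - 2) - x)
        if start < 1:
            continue                   # window would begin before residue 1
        if variant_length is not None and end > variant_length:
            continue                   # window would run past the C-terminus
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    # Junction between residues 325 and 326, window length 10,
    # variant length 447 as given for the variant in the text.
    for start, end in edge_windows(325, 10, variant_length=447):
        print(start, end, end - start + 1)
```

Under this reading every window has exactly n residues, because the end position minus the start position plus one simplifies to n for any x, and each window contains both junction residues.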

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for T39971_P11, comprising a first amino acid sequence
being at least 90 % homologous to 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC 
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV 
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRTS corresponding to amino acids 1 - 326 of
VTNC_HUMAN, which also corresponds to amino acids 1 - 326 of T39971_P11, and a second
amino acid sequence being at least 90 % homologous to
DKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL corresponding to amino acids 442
- 478 of VTNC_HUMAN, which also corresponds to amino acids 327 - 363 of T39971_P11,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for an edge portion of T39971_P11, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more
preferably at least about 40 amino acids in length and most preferably at least about 50 amino
acids in length, wherein at least two amino acids comprise SD, having a structure as follows: a
sequence starting from any of amino acid numbers 326-x to 326; and ending at any of amino
acid numbers 327 + ((n-2) - x), in which x varies from 0 to n-2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T39971_P11, comprising a first amino acid sequence 
being at least 90 % homologous to 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC 
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR 
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE 
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRTS corresponding to amino acids 1 - 326 of 
Q9BSH7, which also corresponds to amino acids 1 - 326 of T39971_P11, and a second amino
acid sequence being at least 90 % homologous to 

DKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL corresponding to amino acids 442
- 478 of Q9BSH7, which also corresponds to amino acids 327 - 363 of T39971_P11, wherein
said first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of T39971_P11, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more 
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise SD, having a structure as follows: a 
sequence starting from any of amino acid numbers 326-x to 326; and ending at any of amino
acid numbers 327 + ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for T39971_P12, comprising a first amino acid sequence
being at least 90 % homologous to 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV 
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR 
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFK corresponding to 
amino acids 1 - 223 of VTNC_HUMAN, which also corresponds to amino acids 1 - 223 of
T39971_P12, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence VPGAVGQGRKHLGRV corresponding to 
amino acids 224 - 238 of T39971_P12, wherein said first and second amino acid sequences are 
contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of T39971_P12, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about 
90% and most preferably at least about 95% homologous to the sequence 
VPGAVGQGRKHLGRV in T39971_P12. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for T39971_P12, comprising a first amino acid sequence
being at least 90 % homologous to 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFK corresponding to 
amino acids 1 - 223 of Q9BSH7, which also corresponds to amino acids 1 - 223 of T39971_P12,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence VPGAVGQGRKHLGRV corresponding to amino acids 224 -
238 of T39971_P12, wherein said first and second amino acid sequences are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of T39971_P12, comprising a polypeptide being at least 
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
VPGAVGQGRKHLGRV in T39971_P12.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P2, comprising a first amino acid
sequence being at least 90 % homologous to 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGATFINAPVTTPMCCPSRSSMLTGKYVHNHNW 
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLKNSRFYNYTVCR 
NGIKJEKHGFDYAKDWTDU 

FSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNIL^ 
SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP

GSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRTNKKAKIWRD 

VERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGK 

LRJHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQ 

GTPKYKPRFVHTRQTRSLSVEFEGEIYDINLE^ 
ASSGGNRGRMLADSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAY^

DKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVK^ 

AAQEVDSKLQLFKENNRRRKKERE^KRRQ 

corresponding to amino acids 1-761 of SUL1_HUMAN, which also corresponds to amino 
acids 1 - 761 of Z21368_PEA_1_P2, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
PHKYSAHGRTRHFESATRTTNGAQKLSRI corresponding to amino acids 762 - 790 of
Z21368_PEA_1_P2, wherein said first and second amino acid sequences are contiguous and in a
sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z21368_PEA_1_P2, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
PHKYSAHGRTRHFESATRTTNGAQKLSRI in Z21368_PEA_1_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first amino acid
sequence being at least 90 % homologous to 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVEL 
corresponding to amino acids 1-57 of Q7Z2W2, which also corresponds to amino acids 1-57
of Z21368_PEA_1_P5, a second bridging amino acid sequence comprising A, and a third amino
acid sequence being at least 90 % homologous to
FFGKYLNEYNGSYIPPGWREWLGLIKNS 
ESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSK^ 
DKHWIMQYTGPMLPIHM^ 

ADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDT

PPDVDGKSVLKLLDPEKPGNRFR

PKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLY 
ARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFE 
GEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPT
TVRVTHKCFILPNDSIHCERELYQSARAWKra 

RKPEECSCSKQSYYNKEKGVK

KEKRRQRKGEECSLPGLTCFTHDN^ 

THNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCN 
PRPKNLDVGNKDGGSYDLHRGQLWDGWEG corresponding to amino acids 139 - 871 of 
Q7Z2W2, which also corresponds to amino acids 59 - 791 of Z21368_PEA_1_P5, wherein said 
first, second and third amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for an edge portion of Z21368_PEA_1_P5, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally 
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more 
preferably at least about 40 amino acids in length and most preferably at least about 50 amino
acids in length, wherein at least two amino acids comprise LAF, having a structure as follows
(numbering according to Z21368_PEA_1_P5): a sequence starting from any of amino acid
numbers 57-x to 57; and ending at any of amino acid numbers 59 + ((n-2) - x), in which x varies
from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELAFF

GKYLNEYNGSYIPPGWREWLGU

INYFKMSKRMYPHRPVMMVISHAAPH 
HWIMQYTGPMLPIHMEFTNILQRKJRLQTLMSV 

HGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPP 
DVDGKSVLKiXDPEKPGNRFRT^ 

KYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYA
RGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFEGE 
IYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTV 
RVTHKCFILPNDSIHCERELYQSARA 
PEECSCSKQSYYNKEKGVKKQE 

KRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETH
NFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLME corresponding to 
amino acids 1-751 of Z21368_PEA_1_P5, and a second amino acid sequence being at least 90
% homologous to LRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG
corresponding to amino acids 1-40 of AAH12997, which also corresponds to amino acids 752 -
791 of Z21368_PEA_1_P5, wherein said first and second amino acid sequences are contiguous
and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a head of Z21368_PEA_1_P5, comprising a polypeptide 
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably 
at least about 90% and most preferably at least about 95% homologous to the sequence 
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELAFF
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNES
INYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDK
HWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIYTAD
HGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPP
DVDGKSVLKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLP
KYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYA
RGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFEGE
IYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTV
RVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRK
PEECSCSKQSYYNKEKGVKKQEKLKSHLHPFKEAAQEVDSKLQLFKENNRRRKKERKE
KRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETH
NFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLME of
Z21368_PEA_1_P5.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first amino acid
sequence being at least 90 % homologous to 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVEL 
corresponding to amino acids 1 - 57 of SUL1_HUMAN, which also corresponds to amino acids
1 - 57 of Z21368_PEA_1_P5, and a second amino acid sequence being at least 90 %
homologous to

AFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLIT
NESINYFKMSKRMYPHRPVMMVK 

MDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYII 
YTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGL 
DTPPDVDGKSVLKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSN
HLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRN
LYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVE
FEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGP 
PTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHL 
KRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLHPFKEAAQEVDSKLQL^
KERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRT
VNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYK 
QCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG corresponding to amino acids 138 - 871 
of SUL1_HUMAN, which also corresponds to amino acids 58 - 791 of Z21368_PEA_1_P5,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of Z21368_PEA_1_P5, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more
preferably at least about 40 amino acids in length and most preferably at least about 50 amino
acids in length, wherein at least two amino acids comprise LA, having a structure as follows: a
sequence starting from any of amino acid numbers 57-x to 57; and ending at any of amino acid 
numbers 58 + ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P15, comprising a first amino acid
sequence being at least 90 % homologous to

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVY 
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCR 
NGIKEKHGFDYAKDWTDLITNESINYTKMSKRMYPHRPVMM 

FSKLYPNASQHITPSYNYAPNMDKHW^

SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP 
GSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRTNKKAKIWRD 
VERG corresponding to amino acids 1 - 416 of SUL1_HUMAN, which also corresponds to
amino acids 1 - 416 of Z21368_PEA_1_P15.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z21368_PEA_1_P16, comprising a first amino acid
sequence being at least 90 % homologous to

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW 
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCR 
NGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQ
FSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDD 
SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP 
GSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNR corresponding to amino 
acids 1 - 397 of SUL1_HUMAN, which also corresponds to amino acids 1 - 397 of
Z21368_PEA_1_P16, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence CVIVPPLSQPQIH corresponding to amino
acids 398 - 410 of Z21368_PEA_1_P16, wherein said first and second amino acid sequences are
contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of Z21368_PEA_1_P16, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
CVIVPPLSQPQIH in Z21368_PEA_1_P16.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z21368_PEA_1_P22, comprising a first amino acid
sequence being at least 90 % homologous to 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGA 

QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNS

NGIKEKHGFDYAK corresponding to amino acids 1-188 of SUL1_HUMAN, which also
corresponds to amino acids 1 - 188 of Z21368_PEA_1_P22, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
ARYDGDQPRCAPRPRGLSPTVF corresponding to amino acids 189 - 210 of
Z21368_PEA_1_P22, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z21368_PEA_1_P22, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
ARYDGDQPRCAPRPRGLSPTVF in Z21368_PEA_1_P22.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P23, comprising a first amino acid
sequence being at least 90 % homologous to

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW
QAMHEPRTFAVYLNNTGYRT corresponding to amino acids 1-137 of Q7Z2W2, which also 
corresponds to amino acids 1 - 137 of Z21368_PEA_1_P23, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
GLLHRLNH corresponding to amino acids 138 - 145 of Z21368_PEA_1_P23, wherein said
first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z21368_PEA_1_P23, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
GLLHRLNH in Z21368_PEA_1_P23. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z21368_PEA_1_P23, comprising a first amino acid
sequence being at least 90 % homologous to 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFINAFVT^ 

QAMHEPRTFAVYLNNTGYRT corresponding to amino acids 1-137 of SUL1_HUMAN,
which also corresponds to amino acids 1 - 137 of Z21368_PEA_1_P23, and a second amino
acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence GLLHRLNH corresponding to amino acids 138 - 145 of Z21368_PEA_1_P23,
wherein said first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z21368_PEA_1_P23, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
GLLHRLNH in Z21368_PEA_1_P23.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMGRP5E_P4, comprising a first amino acid
sequence being at least 90 % homologous to 

MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTG 
ESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSED 
SSNFKDVGSKGK corresponding to amino acids 1-127 of GRP_HUMAN, which also
corresponds to amino acids 1-127 of HUMGRP5E_P4, and a second amino acid sequence
being at least 90 % homologous to GSQREGRNPQLNQQ corresponding to amino acids 135 -
148 of GRP_HUMAN, which also corresponds to amino acids 128 - 141 of HUMGRP5E_P4,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of HUMGRP5E_P4, comprising a
polypeptide having a length "n", wherein n is at least about 10 amino acids in length, optionally 
at least about 20 amino acids in length, preferably at least about 30 amino acids in length, more 
preferably at least about 40 amino acids in length and most preferably at least about 50 amino 
acids in length, wherein at least two amino acids comprise KG, having a structure as follows: a
sequence starting from any of amino acid numbers 127-x to 127; and ending at any of amino
acid numbers 128 + ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMGRP5E_P5, comprising a first amino acid
sequence being at least 90 % homologous to 

MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTG
ESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSED 
SSNFKDVGSKGK corresponding to amino acids 1-127 of GRP_HUMAN, which also
corresponds to amino acids 1 - 127 of HUMGRP5E_P5, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
DSLLQVLNVKEGTPS corresponding to amino acids 128 - 142 of HUMGRP5E_P5, wherein
said first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of HUMGRP5E_P5, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
DSLLQVLNVKEGTPS in HUMGRP5E_P5. 
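
One way to see where such a tail begins is to compare two variants that share the same head and report the first position at which they diverge. The sketch below does this for the two HUMGRP5E variants described above, using the sequences as printed in the text with stray spaces removed (an assumption about the intended reading); the helper names `common_prefix_length` and `unique_tail` are hypothetical.

```python
# Shared 127-residue head, transcribed from the text (spaces removed).
SHARED_HEAD = (
    "MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTG"
    "ESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSED"
    "SSNFKDVGSKGK"
)

# Variants assembled as described in the text: shared head plus second segment.
HUMGRP5E_P4 = SHARED_HEAD + "GSQREGRNPQLNQQ"
HUMGRP5E_P5 = SHARED_HEAD + "DSLLQVLNVKEGTPS"


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading residues that two sequences have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def unique_tail(variant: str, other: str) -> str:
    """Portion of `variant` that follows the head it shares with `other`."""
    return variant[common_prefix_length(variant, other):]


if __name__ == "__main__":
    shared = common_prefix_length(HUMGRP5E_P4, HUMGRP5E_P5)
    print(shared)                                   # length of the shared head
    print(unique_tail(HUMGRP5E_P5, HUMGRP5E_P4))    # tail of HUMGRP5E_P5
```

The two variants agree over residues 1 - 127 and differ at residue 128, so the tail recovered for HUMGRP5E_P5 spans residues 128 - 142, matching the numbering recited in the text.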

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for D56406_PEA_1_P2, comprising a first amino acid
sequence being at least 90 % homologous to 

MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEA

LLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAF
QHWE corresponding to amino acids 1-120 of NEUT_HUMAN, which also corresponds to
amino acids 1-120 of D56406_PEA_1_P2, a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
ARWLTPVIPALWEAETGGSRGQEMETIPANT corresponding to amino acids 121 - 151 of
D56406_PEA_1_P2, and a third amino acid sequence being at least 90 % homologous to
LIQEDILDTGNDKNGKEEVIKI^ corresponding to
amino acids 121-170 of NEUT_HUMAN, which also corresponds to amino acids 152 - 201 of
D56406_PEA_1_P2, wherein said first, second and third amino acid sequences are contiguous
and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for an edge portion of D56406_PEA_1_P2, comprising an amino 
acid sequence being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence encoding for ARWLTPVIPALWEAETGGSRGQEMETIPANT, corresponding to
D56406_PEA_1_P2.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for D56406_PEA_1_P5, comprising a first amino acid 
sequence being at least 90 % homologous to MMAGMKIQLVCMLLLAFSSWSLC
corresponding to amino acids 1 - 23 of NEUT_HUMAN, which also corresponds to amino acids
1-23 of D56406_PEA_1_P5, and a second amino acid sequence being at least 90 %
homologous to 

SEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVHEEEL
VARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKR
KIPYILKRQLYENKPRRPYILKRDSYYY corresponding to amino acids 26- 170 of 
NEUT_HUMAN, which also corresponds to amino acids 24 - 168 of D56406_PEA_1_P5,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of D56406_PEA_1_P5, comprising
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length, 
optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in 
length, more preferably at least about 40 amino acids in length and most preferably at least 
about 50 amino acids in length, wherein at least two amino acids comprise CS, having a
structure as follows: a sequence starting from any of amino acid numbers 23-x to 24; and ending
at any of amino acid numbers + ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for D56406_PEA_1_P6, comprising a first amino acid
sequence being at least 90 % homologous to 

MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSK corresponding to
amino acids 1-45 of NEUT_HUMAN, which also corresponds to amino acids 1 - 45 of
D56406_PEA_1_P6, and a second amino acid sequence being at least 90 % homologous to 
LIQEDILDTGNDKNGKEEVIKIIKIPYILKRQLYENKPRJ^ corresponding to
amino acids 121 - 170 of NEUT_HUMAN, which also corresponds to amino acids 46 - 95 of
D56406_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for an edge portion of D56406_PEA_1_P6, comprising
a polypeptide having a length "n", wherein n is at least about 10 amino acids in length,
optionally at least about 20 amino acids in length, preferably at least about 30 amino acids in 
length, more preferably at least about 40 amino acids in length and most preferably at least
about 50 amino acids in length, wherein at least two amino acids comprise KL, having a 
structure as follows: a sequence starting from any of amino acid numbers 45-x to 46; and ending 
at any of amino acid numbers 46+ ((n-2) - x), in which x varies from 0 to n-2. 

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for F05068_PEA_1_P7, comprising a first amino acid
sequence being at least 90 % homologous to 

MKLVSVALMYLGSLAFLGADTARLDVASEFRKK corresponding to amino acids 1 - 33 of 
ADML_HUMAN, which also corresponds to amino acids 1 - 33 of F05068_PEA_1_P7.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for F05068_PEA_1_P8, comprising a first amino acid
sequence being at least 90 % homologous to 
MKLVSVALMYLGSLAFLGADTARL^ 

DVKAGPAQTLIRPQDMKGASRSPED corresponding to amino acids 1 - 82 of
ADML_HUMAN, which also corresponds to amino acids 1 - 82 of F05068_PEA_1_P8, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence R corresponding to amino acids 83 - 83 of F05068_PEA_1_P8, wherein
said first and second amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for H14624_P15, comprising a first amino acid
sequence being at least 90 % homologous to 

MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLCHGIEYQNMR 
LPNLLGHETMKEVLEQAGAWIPLVMKQCHPDTKXFLCSLFAPVCLDDLD 
VQVKDRCAPVMSAFGFPWPDMLECDRFPQDNDLCIPLASSDHLLPATEE corresponding 
to amino acids 1 - 167 of Q9HAP5, which also corresponds to amino acids 1 - 167 of
H14624_P15, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence GKPSLLLPHSLLG corresponding to amino
acids 168 - 180 of H14624_P15, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of H14624_P15, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence GKPSLLLPHSLLG
in H14624_P15.
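
The percentage thresholds recited above can be illustrated with a simple calculation. The following is a minimal sketch, assuming that homology is approximated by ungapped, position-by-position identity between two sequences of equal length; this is a deliberate simplification, and homology may instead be determined with an alignment program that allows gaps and conservative substitutions. The names `percent_identity` and `meets_threshold`, and the single-substitution variant shown, are hypothetical and used for illustration only.

```python
def percent_identity(candidate, reference):
    """Percentage of positions at which two equal-length sequences carry the same residue."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(candidate, reference) if a == b)
    return 100.0 * matches / len(reference)


def meets_threshold(candidate, reference, threshold):
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Tail sequence of H14624_P15 as recited above (13 amino acids).
TAIL = "GKPSLLLPHSLLG"

if __name__ == "__main__":
    variant = "GKPSLLLPHSLLA"  # hypothetical variant with one substitution
    print(round(percent_identity(variant, TAIL), 1))
    print(meets_threshold(variant, TAIL, 90), meets_threshold(variant, TAIL, 95))
```

Under this simplified measure, a single substitution in the 13 amino acid tail leaves 12 of 13 positions identical, which satisfies the 90% threshold but not the 95% threshold.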

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for H38804_PEA_1_P5, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
corresponding to amino acids 1 - 57 of H38804_PEA_1_P5, and a second amino acid sequence
being at least 90 % homologous to

MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGA 
VLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIRCVEYCPEVNVMVTG 
SWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQ
QRRESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENN
IEQIYPWAISFHNIHNTFATGGSDGFWIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTL 
AIASSYMYEMDDTEHPEDGIFIRQVTDAETKPK corresponding to amino acids 1 - 324 of 
BUB3_HUMAN, which also corresponds to amino acids 58 - 381 of H38804_PEA_1_P5,
wherein said first and second amino acid sequences are contiguous and in a sequential order. 
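
The requirement that the first and second amino acid sequences be contiguous and in a sequential order can be checked against the recited coordinates. The following is a minimal sketch, assuming 1-based inclusive coordinates; the `Segment` type and the `check_contiguous` function are hypothetical names introduced for illustration only.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    start: int                       # first amino acid in the variant (1-based)
    end: int                         # last amino acid in the variant (inclusive)
    ref_start: Optional[int] = None  # matching start in the known protein, if any
    ref_end: Optional[int] = None    # matching end in the known protein, if any


def check_contiguous(segments: List[Segment]) -> bool:
    """True if the segments tile the variant from amino acid 1 with no gap or overlap,
    and each segment mapped to a known protein has the same length in both."""
    expected_start = 1
    for seg in segments:
        if seg.start != expected_start or seg.end < seg.start:
            return False
        if seg.ref_start is not None and seg.ref_end is not None:
            if seg.ref_end - seg.ref_start != seg.end - seg.start:
                return False
        expected_start = seg.end + 1
    return True


# Coordinates recited above for H38804_PEA_1_P5:
# a unique head (1 - 57) followed by amino acids 1 - 324 of BUB3_HUMAN (58 - 381).
H38804_PEA_1_P5 = [Segment(1, 57), Segment(58, 381, 1, 324)]

if __name__ == "__main__":
    print(check_contiguous(H38804_PEA_1_P5))
```

Here the head ends at amino acid 57 and the second sequence begins at amino acid 58, and both the 58 - 381 range and the 1 - 324 range span 324 amino acids, so the check succeeds.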

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a head of H38804_PEA_1_P5, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
of H38804_PEA_1_P5.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for H38804_PEA_1_P17, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
corresponding to amino acids 1 - 57 of H38804_PEA_1_P17, and a second amino acid sequence
being at least 90 % homologous to

MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGA 
VLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIRCVEYCPEWVMVTG
SWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQ
QRRESSLKYQTRCIRAFPNKQGWLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENN 
IEQIYPVNAISFHNIHNTFATGGSDGFWIWDPFNKKRLCQFHRYPISIASLAFSNDGTTL 
AIASSYMYEMDDTEHPEDGIFIRQVTDAETKPKSPCT corresponding to amino acids 1 - 
328 of BUB3_HUMAN, which also corresponds to amino acids 58 - 385 of
H38804_PEA_1_P17, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of H38804_PEA_1_P17, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
of H38804_PEA_1_P17.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HSENA78_P2, comprising a first amino acid
sequence being at least 90 % homologous to

MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCVCLQTTQGVHP 

KMISNLQVFAIGPQCSKVEW corresponding to amino acids 1 - 81 of SZ05_HUMAN,
which also corresponds to amino acids 1 - 81 of HSENA78_P2.

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 - 29 of
HUMODCA_P9, and a second amino acid sequence being at least 90 % homologous to
LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAJCELNIDVVGVSFHVGSGCTDPETFV 
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSG
VRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFN 
CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN 
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA 
WESGMKRHRAACASASINV corresponding to amino acids 151 - 461 of DCOR_HUMAN,
which also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of HUMODCA_P9, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 - 29 of
HUMODCA_P9, and a second amino acid sequence being at least 90 % homologous to
LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGC1DPETFV 
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVEXKFEEITGVINPALDKYFPSDSG 

VWIAEPGRYYVASAFTLAVMIAKKIVLKEQTGS

CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN 
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA 
WESGMKRHRAACASASINV corresponding to amino acids 40 - 350 of AAA59968, which
also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and second
amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a head of HUMODCA_P9, comprising a polypeptide being at
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least
about 90% and most preferably at least about 95% homologous to the sequence
MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 - 29 of
HUMODCA_P9, and a second amino acid sequence being at least 90 % homologous to
LVLMATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFV 
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSG 
VRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFN 
CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN 
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA 
WESGMKRHRAACASASINV corresponding to amino acids 86 - 396 of AAH14562, which
also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and second
amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of HUMODCA_P9, comprising a polypeptide being at 
least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at least 
about 90% and most preferably at least about 95% homologous to the sequence 
MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R00299_P3, comprising a first amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 
90% and most preferably at least 95% homologous to a polypeptide having the sequence 
MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV corresponding to 
amino acids 1 - 44 of R00299_P3, a second amino acid sequence being at least 90 % homologous
to
SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKJVRAFFDNRNLRKGPSGLA
DEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRITLEEYRNV 
corresponding to amino acids 74 - 191 of Q9NWT9, which also corresponds to amino acids 45 -
162 of R00299_P3, and a third amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence

VEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIE 
TKMHVRFLNMETMALCH corresponding to amino acids 163 - 238 of R00299_P3, wherein 
said first, second and third amino acid sequences are contiguous and in a sequential order. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a head of R00299_P3, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about 
90% and most preferably at least about 95% homologous to the sequence 
MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV of R00299_P3. 

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of R00299_P3, comprising a polypeptide being at least
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
VEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIE
TKMHVRFLNMETMALCH in R00299_P3.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for R00299_P3, comprising a first amino acid sequence 
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 
90% and most preferably at least 95% homologous to a polypeptide having the sequence 

MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV corresponding to
amino acids 1 - 44 of R00299_P3, and a second amino acid sequence being at least 90 %
homologous to 

SSDQIEQLHRRFKQLSGDQPTIRKENFNN 
DEINFEDFLTIMSYFRPIDTTMDEEQVELS 
ELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFL
MHVRFLNMETMALCH corresponding to amino acids 21 - 214 of TESC_HUMAN, which
also corresponds to amino acids 45 - 238 of R00299_P3, wherein said first and second amino
acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a head of R00299_P3, comprising a polypeptide being at least 
70%, optionally at least about 80%, preferably at least about 85%, more preferably at least about
90% and most preferably at least about 95% homologous to the sequence
MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV of R00299_P3.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for W60282_PEA_1_P14, comprising a first amino acid
sequence being at least 90 % homologous to

MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTA 
AHCLKP corresponding to amino acids 1 - 66 of Q8IXD7, which also corresponds to amino 
acids 1 - 66 of W60282_PEA_1_P14, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
TPASHLAMRQHHHH corresponding to amino acids 67 - 80 of W60282_PEA_1_P14,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of W60282_PEA_1_P14, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
TPASHLAMRQHHHH in W60282_PEA_1_P14.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first amino acid
sequence being at least 90 % homologous to

MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKJRYSDVKKLEMKPKYPHC 
TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 1 - 
95 of SZ14_HUMAN, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a

polypeptide having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to
amino acids 96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid
sequences are contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first amino acid 
sequence being at least 90 % homologous to
MRLLAAALLLLLLALYTAR^^ 

TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 13 - 
107 of Q9NS21, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to amino acids
96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an
isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at 
least about 90% and most preferably at least about 95% homologous to the sequence 
YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10. 

According to preferred embodiments of the present invention, there is provided an
isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first amino acid
sequence being at least 90 % homologous to
MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPHCEEKMVII
TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 13 - 
107 of AAQ89265, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to amino acids
96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

According to preferred embodiments of the present invention, there is provided an 
isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a polypeptide being
at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably at
least about 90% and most preferably at least about 95% homologous to the sequence
YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10.

According to preferred embodiments of the present invention, there is provided an 
antibody capable of specifically binding to an epitope of an amino acid sequence.

Optionally the amino acid sequence corresponds to a bridge, edge portion, tail, head or 
insertion. 

Optionally the antibody is capable of differentiating between a splice variant having said 
epitope and a corresponding known protein. 

According to preferred embodiments of the present invention, there is provided a kit for 
detecting lung cancer, comprising a kit detecting overexpression of a splice variant according to 
any of the above claims. 

Optionally the kit comprises a NAT-based technology. 

Optionally the kit further comprises at least one primer pair capable of selectively 
hybridizing to a nucleic acid sequence according to any of the above claims. 

Optionally the kit further comprises at least one oligonucleotide capable of selectively 
hybridizing to a nucleic acid sequence according to any of the above claims. 

Optionally the kit comprises an antibody according to any of the above claims. 

Optionally the kit further comprises at least one reagent for performing an ELISA or a 
Western blot. 

According to preferred embodiments of the present invention, there is provided a method 
for detecting lung cancer, comprising detecting overexpression of a splice variant according to 
any of the above claims. 

Optionally the detecting overexpression is performed with a NAT-based technology. 

Optionally detecting overexpression is performed with an immunoassay.

Optionally the immunoassay comprises an antibody according to any of the above
claims.

According to preferred embodiments of the present invention, there is provided a 
biomarker capable of detecting lung cancer, comprising any of the above nucleic acid sequences 
or a fragment thereof, or any of the above amino acid sequences or a fragment thereof.

According to preferred embodiments of the present invention, there is provided a method 
for screening for lung cancer, comprising detecting lung cancer cells with a biomarker or an 
antibody or a method or assay according to any of the above claims. 

According to preferred embodiments of the present invention, there is provided a method 
for diagnosing lung cancer, comprising detecting lung cancer cells with a biomarker or an
antibody or a method or assay according to any of the above claims. 

According to preferred embodiments of the present invention, there is provided a method 
for monitoring disease progression and/or treatment efficacy and/or relapse of lung cancer, 
comprising detecting lung cancer cells with a biomarker or an antibody or a method or assay 
according to any of the above claims.

According to preferred embodiments of the present invention, there is provided a method 
of selecting a therapy for lung cancer, comprising detecting lung cancer cells with a biomarker 
or an antibody or a method or assay according to any of the above claims and selecting a therapy 
according to said detection. 


Unless defined otherwise, all technical and scientific terms used herein have the meaning 
commonly understood by a person skilled in the art to which this invention belongs. The 
following references provide one of skill with a general definition of many of the terms used in 
this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 
1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary
of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The 
Harper Collins Dictionary of Biology (1991). All of these are hereby incorporated by reference 
as if fully set forth herein. As used herein, the following terms have the meanings ascribed to 
them unless specified otherwise. 

Figure 1 is a schematic summary of the cancer biomarkers selection engine and the wet
validation stages.

Figure 2 is a schematic illustration depicting the grouping of transcripts of a given contig
based on the presence or absence of unique sequence regions.

Figure 3 is a schematic summary of quantitative real-time PCR analysis.

Figure 4 is a schematic presentation of the oligonucleotide based microarray fabrication.

Figure 5 is a schematic summary of the oligonucleotide based microarray experimental
flow.

Figure 6 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster H61775, demonstrating overexpression in brain malignant tumors and a mixture of
malignant tumors from different tissues.

Figure 7 is a histogram showing expression of transcripts of variants of the
immunoglobulin superfamily, member 9, H61775 transcripts, which are detectable by amplicon
as depicted in sequence name H61775seg8, in normal and cancerous lung tissues.

Figure 8 is a histogram showing expression of immunoglobulin superfamily, member 9,
H61775 transcripts, which are detectable by amplicon as depicted in sequence name
H61775seg8, in different normal tissues.

Figure 9 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster M85491, demonstrating overexpression in epithelial malignant tumors and a mixture of
malignant tumors from different tissues.

Figure 10 is a histogram showing overexpression of the above-indicated Ephrin type-B
receptor 2 precursor M85491 transcripts, which are detectable by amplicon as depicted in
sequence name M85491seg24, in cancerous lung samples relative to the normal samples.

Figure 11 is a histogram showing the expression of Ephrin type-B receptor 2 precursor
(Tyrosine-protein kinase receptor EPH-3) M85491 transcripts which are detectable by amplicon
as depicted in sequence name M85491seg24 in different normal tissues.

Figure 12 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster T39971, demonstrating overexpression in liver cancer, lung malignant tumors and
pancreas carcinoma.

Figure 13 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster Z21368, demonstrating overexpression in epithelial malignant tumors, a mixture of
malignant tumors from different tissues and pancreas carcinoma.

Figure 14 is a histogram showing overexpression of the Extracellular sulfatase Sulf-1
Z21368 transcripts, which are detectable by amplicon as depicted in sequence name
Z21368junc17-21, in cancerous lung samples relative to the normal samples.

Figure 15 is a histogram showing the expression of Extracellular sulfatase Sulf-1
Z21368 transcripts, which are detectable by amplicon as depicted in sequence name
Z21368junc17-21, in different normal tissues.

Figure 16 is a histogram showing overexpression of the SUL1_HUMAN -
Extracellular sulfatase Sulf-1, Z21368 transcripts, which are detectable by amplicon as depicted
in sequence name Z21368seg39, in cancerous lung samples relative to the normal samples.

Figure 17 is a histogram showing expression of SUL1_HUMAN - Extracellular sulfatase
Sulf-1, Z21368 transcripts, which are detectable by amplicon as depicted in sequence name
Z21368seg39, in different normal tissues.

Figure 18 is a histogram showing the expression of SMO2_HUMAN SPARC related
modular calcium-binding protein 2 precursor (Secreted modular calcium-binding protein 2)
(SMOC-2) (Smooth muscle-associated protein 2) Z44808 transcripts which are detectable by
amplicon as depicted in sequence name Z44808junc8-11 in different normal tissues.

Figure 19 is a histogram showing overexpression of the gastrin-releasing peptide
(HUMGRP5E) transcripts, which are detectable by amplicon as depicted in sequence name
HUMGRP5Ejunc3-7, in several cancerous lung samples relative to the normal samples.

Figure 20 is a histogram showing the expression of gastrin-releasing peptide
(HUMGRP5E) transcripts, which are detectable by amplicon as depicted in sequence name
HUMGRP5Ejunc3-7, in different normal tissues.

Figure 21 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster F05068, demonstrating overexpression in uterine malignancies.

Figure 22 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster H14624, demonstrating overexpression in colorectal cancer, epithelial malignant tumors,
a mixture of malignant tumors from different tissues, lung malignant tumors and pancreas
carcinoma.

Figure 23 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster H38804, demonstrating overexpression in transitional cell carcinoma, brain malignant
tumors, a mixture of malignant tumors from different tissues and gastric carcinoma.

Figure 24 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HSENA78, demonstrating overexpression in epithelial malignant tumors and lung
malignant tumors.

Figure 25 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HUMODCA, demonstrating overexpression in brain malignant tumors, colorectal
cancer, epithelial malignant tumors and a mixture of malignant tumors from different tissues.

Figure 26 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster R00299, demonstrating overexpression in lung malignant tumors.

Figure 27 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster Z41644, demonstrating overexpression in lung malignant tumors, breast malignant
tumors and pancreas carcinoma.

Figure 28 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster Z44808, demonstrating overexpression in colorectal cancer, lung cancer and pancreas
carcinoma.

Figure 29 is a histogram showing overexpression of the SMO2_HUMAN SPARC related
modular calcium-binding protein 2 Z44808 transcripts, which are detectable by amplicon as
depicted in sequence name Z44808junc8-11, in cancerous lung samples relative to the normal
samples.

Figure 30 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster AA161187, demonstrating overexpression in brain malignant tumors, epithelial
malignant tumors and a mixture of malignant tumors from different tissues.

Figure 31 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster AA161187, demonstrating overexpression in brain malignant tumors and a mixture of
malignant tumors from different tissues.

Figure 32 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HUMCA1XIA, demonstrating overexpression in bone malignant tumors, epithelial
malignant tumors, a mixture of malignant tumors from different tissues and lung malignant
tumors.

Figure 33 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HUMCEA, demonstrating overexpression in epithelial malignant tumors, a mixture of
malignant tumors from different tissues and pancreas carcinoma.

Figure 34 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster R35137, demonstrating overexpression in hepatocellular carcinoma.

Figure 35 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster Z25299, demonstrating overexpression in brain malignant tumors, a mixture of
malignant tumors from different tissues and ovarian carcinoma.

Figure 36 is a histogram showing down regulation of the Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor Z25299 transcripts, which are detectable by amplicon
as depicted in sequence name Z25299junc13-14-21, in cancerous lung samples relative to the
normal samples.

Figure 37 is a histogram showing down regulation of the Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor Z25299 transcripts, which are detectable by amplicon
as depicted in sequence name Z25299seg20, in cancerous lung samples relative to the normal
samples.

Figure 38 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HSSTROL3, demonstrating overexpression in transitional cell carcinoma, epithelial
malignant tumors, a mixture of malignant tumors from different tissues and pancreas carcinoma.

Figure 39 is a histogram showing overexpression of the Stromelysin-3 HSSTROL3
transcripts, which are detectable by amplicon as depicted in sequence name HSSTROL3seg24,
in cancerous lung samples relative to the normal samples.

Figure 40 is a histogram showing the expression of Stromelysin-3
HSSTROL3 transcripts, which are detectable by amplicon as depicted in sequence name
HSSTROL3seg24, in different normal tissues.

Figure 41 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HUMTREFAC, demonstrating overexpression in a mixture of malignant tumors from
different tissues, breast malignant tumors, pancreas carcinoma and prostate cancer.

Figure 42 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HSS100PCB, demonstrating overexpression in a mixture of malignant tumors from
different tissues.

Figure 43 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster HSU33147, demonstrating overexpression in a mixture of malignant tumors from
different tissues.

Figure 44 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster R20779, demonstrating overexpression in epithelial malignant tumors, a mixture of
malignant tumors from different tissues and lung malignant tumors.

Figure 45 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster R38144, demonstrating overexpression in epithelial malignant tumors, lung malignant
tumors, skin malignancies and gastric carcinoma.

Figure 46 is a histogram showing Cancer and cell- line vs. normal tissue expression for 
Cluster HUMOSTRO, demonstrating overexpression in epithelial malignant tumors, a mixture 
of malignant tumors from different tissues, lung malignant tumors, breast malignant tumors, 
ovarian carcinoma and skin malignancies. 

Figure 47 is a histogram showing Cancer and cell- line vs. normal tissue expression for 
Cluster HUMOSTRO, demonstrating overexpression in epithelial malignant tumors, a mixture 
of malignant tumors from different tissues and kidney malignant tumors. 

Figure 48 is a histogram showing overexpression of the R11723 transcripts, which are
detectable by amplicon as depicted in sequence name R11723 seg13, in cancerous lung samples
relative to the normal samples.

Figure 49 is a histogram showing the expression of R11723 transcripts, which are
detectable by amplicon as depicted in sequence name R11723 seg13, in different normal tissues.

Figure 50 is a histogram showing overexpression of the R11723 transcripts, which are
detectable by amplicon as depicted in sequence name R11723 junc11-18, in cancerous lung
samples relative to the normal samples.

Figure 51 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster R16276, demonstrating overexpression in lung malignant tumors.

Figures 52-53 are histograms showing differential expression of the 6 sequences
H61775seg8, HUMGRP5E junc3-7, M85491seg24, Z21368 junc17-21, HSSTROL3seg24 and
Z25299seg20 in cancerous lung samples relative to the normal samples.

Figure 54a is a histogram showing the relative expression of trophinin associated protein
(tastin) [T86235] variants (e.g., variant no. 23-26, 31, 32) in normal and tumor derived lung
samples as determined by real time PCR using primers for SEQ ID NO: 1480.

Figure 54b is a histogram showing the relative expression of trophinin associated protein
(tastin) [T86235] variants (e.g., variant no. 8-10, 22, 23, 26, 27, 29-31, 33) in normal and tumor
derived lung samples as determined by micro-array analysis using oligos detailed in SEQ ID NOs:
1512-1514.

Figure 55 is a histogram showing the relative expression of Homeo box C10 (HOXC10)
[N31842] variants (e.g., variant no. 3) in normal and tumor derived lung samples as determined
by real time PCR using primers for SEQ ID NO: 1517.

Figures 56a-b are histograms showing on two different scales the relative expression of
Nucleolar protein 4 (NOL4) [T06014] variants (e.g., variant no. 3, 11 and 12) in normal and
tumor derived lung samples as determined by real time PCR using primers for SEQ ID NO:
1529. Figure 56a shows the results on scale: 0-1200. Figure 56b shows the results on scale:
0-24.

Figures 57a-b are histograms showing on two different scales the relative expression of
Nucleolar protein 4 (NOL4) [T06014] variants (e.g., variant no. 3, 11 and 12) in normal and
tumor derived lung samples as determined by real time PCR using primers for SEQ ID NO:
1532. Figure 57a shows the results on scale: 0-2000. Figure 57b shows the results on scale:
0-42.

Figure 58 is a histogram showing the relative expression of AA281370 variants (e.g., 
variant no. 0, 1, 4 and 5) in normal and tumor derived lung samples as determined by real time 
PCR using primers for SEQ ID NO: 1558.

Figure 59 is a histogram showing the relative expression of Sulfatase 1 (SULF1)-[Z21368]
variants (e.g., variant no. 13 and 14) in normal and tumor derived lung samples as
determined by real time PCR using primers for SEQ ID NO: 1574.

Figure 60 is a histogram showing the relative expression of SRY (sex determining region
Y)-box 2 (SOX2)-[HUMHMGBOX] variants (e.g., variant no. 0) in normal and tumor derived
lung samples as determined by real time PCR using primers for SEQ ID NO: 1594.

Figure 61 is a histogram showing the relative expression of Plakophilin 1 (ectodermal
dysplasia/skin fragility syndrome) (PKP1)-[HSB6PR] variants (e.g., variant no. 0, 5 and 6) in
normal and tumor derived lung samples as determined by real time PCR using primers for SEQ
ID NO: 1600.

Figure 62 is a histogram showing the relative expression of transcripts detectable by SEQ
ID NOs: 1480, 1517, 1529, 1532, 1558, 1574, 1594, 1600, 1616, 1619, 1622, 1625 in normal and
tumor derived lung samples as determined by real time PCR.

Figure 63 is an amino acid sequence alignment, using NCBI BLAST default parameters,
demonstrating similarity between the AA281370 lung cancer biomarker of the present invention
and WD40 domains of various proteins involved in the MAPK signal transduction pathway. Figure
63a: amino acids at positions 40-790 of the AA281370 polypeptide SEQ ID NO: 99 have 75%
homology to mouse Mapkbp1 protein (gi|47124622). Figure 63b: amino acids at positions 40-886
of the AA281370 polypeptide SEQ ID NO: 99 have 70% homology to rat JNK-binding
protein JNKBP1 (gi|34856717).



Figure 64 is a histogram showing overexpression of the Homo sapiens protease, serine,
21 (testisin) (PRSS21) AA161187 transcripts, which are detectable by amplicon as depicted in
sequence name AA161187 seg25, in cancerous lung samples relative to the normal samples.

Figure 65 is a histogram showing overexpression of the protein tyrosine phosphatase,
receptor type, S (PTPRS) M62069 transcripts, which are detectable by amplicon as depicted in
sequence name M62069 seg19, in cancerous lung samples relative to the normal samples.

Figure 66 is a histogram showing overexpression of the protein tyrosine phosphatase,
receptor type, S (PTPRS) M62069 transcripts, which are detectable by amplicon as depicted in
sequence name M62069 seg29, in cancerous lung samples relative to the normal samples.

Figure 67 is a histogram showing overexpression of the above-indicated Homo sapiens
collagen, type XI, alpha 1 (COL11A1) transcripts, which are detectable by amplicon as depicted
in sequence name HUMCA1XIA seg55, in cancerous lung samples relative to the normal
samples.

Figure 68 is a histogram showing down regulation of the Homo sapiens secretory
leukocyte protease inhibitor (antileukoproteinase) (SLPI) Z25299 transcripts, which are
detectable by amplicon as depicted in sequence name Z25299 seg23, in cancerous lung samples
relative to the normal samples.

Figure 69 is a histogram showing the expression of Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor Z25299 transcripts, which are detectable by amplicon
as depicted in sequence name Z25299seg20, in different normal tissues.

Figure 70 is a histogram showing the expression of Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor Z25299 transcripts, which are detectable by amplicon
as depicted in sequence name Z25299seg23, in different normal tissues.

Figure 71 is a histogram showing overexpression of the Homo sapiens matrix
metalloproteinase 11 (stromelysin 3) (MMP11) HSSTROL3 transcripts, which are detectable by
amplicon as depicted in sequence name HSSTROL3 seg20-2, in cancerous lung samples relative
to the normal samples.

Figure 72 is a histogram showing overexpression of the Homo sapiens matrix
metalloproteinase 11 (stromelysin 3) (MMP11) HSSTROL3 transcripts, which are detectable by
amplicon as depicted in sequence name HSSTROL3 junc21-27, in cancerous lung samples
relative to the normal samples.

Figure 73 is a histogram showing the expression of R11723 transcripts, which were
detected by amplicon as depicted in the sequence name R11723 junc11-18, in different normal
tissues.

Figure 74 is a histogram showing overexpression of the Homo sapiens fibroblast
growth factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by amplicon
as depicted in sequence name H53626 junc24-27F1R3, in cancerous lung samples relative to the
normal samples.

Figure 75 is a histogram showing the expression of the Homo sapiens fibroblast growth
factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by amplicon as
depicted in sequence name H53626 seg25, in cancerous lung samples relative to the normal
samples.

Figure 76 is a histogram showing Cancer and cell-line vs. normal tissue expression for
Cluster H53626, demonstrating overexpression in epithelial malignant tumors, a mixture of
malignant tumors from different tissues and myosarcoma.

Figure 77 is a histogram showing the expression of Homo sapiens fibroblast growth
factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by amplicon as
depicted in sequence name H53626 seg25, in different normal tissues.

Figure 78 is a histogram showing the expression of Homo sapiens fibroblast growth
factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by amplicon as
depicted in sequence name H53626 junc24-27F1R3, in different normal tissues.

Figure 79 shows the PSEC R11723_PEA_1 T5 PCR product; Lane 1: PCR product; and
Lane 2: Low DNA Mass Ladder MW marker (Invitrogen Cat# 10068-013).

Figure 80: PSEC R11723_PEA_1 T5 PCR product sequence; In Red- PSEC Forward
primer; In Blue- PSEC Reverse complementary sequence; and Highlighted sequence- PSEC
variant R11723_PEA_1 T5 ORF.

Figure 81- PSEC PCR product digested with NheI and HindIII; Lane 1- PSEC PCR
product; Lane 2- Fermentas GeneRuler 1Kb DNA Ladder #SM0313.

Figure 82 shows a plasmid map of His PSEC T5 pRSETA.

Figure 83: Protein sequence of PSEC variant R11723_PEA_1 T5; In red- 6His tag; In
blue- PSEC.

Figure 84 shows the DNA sequence of HisPSEC T5 pRSETA; bold- HisPSEC T5 open 
reading frame; Italic- flanking DNA sequence which was verified by sequence analysis. 

Figure 85 shows Western blot analysis of recombinant HisPSEC variant R11723_PEA_1
T5; lane 1: molecular weight marker (ProSieve color, Cambrex, Cat #50550); lane 2: HisPSEC
T5 pRSETA T0; lane 3: His HisPSEC T5 pRSETA T3; lane 4: His HisPSEC T5 pRSETA To.n;
lane 5: pRSET empty vector T0 (negative control); lane 6: pRSET empty vector T3 (negative
control); lane 7: pRSET empty vector To.n (negative control); and lane 8: His positive control
protein (HisTroponinT7 pRSETA T3).

DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention is of novel markers for lung cancer that are both sensitive and 
accurate. Furthermore, at least certain of these markers are able to distinguish between various 
types of lung cancer, such as small cell carcinoma; large cell carcinoma; squamous cell 
carcinoma; and adenocarcinoma, alone or in combination. These markers are differentially 
expressed, and preferably overexpressed, in lung cancer specifically, as opposed to normal lung 
tissue. The measurement of these markers, alone or in combination, in patient samples provides 
information that the diagnostician can correlate with a probable diagnosis of lung cancer. The 
markers of the present invention, alone or in combination, show a high degree of differential 
detection between lung cancer and non-cancerous states. The markers of the present invention, 
alone or in combination, can be used for prognosis, prediction, screening, early diagnosis, 
therapy selection and treatment monitoring of lung cancer. For example, optionally and 
preferably, these markers may be used for staging lung cancer and/or monitoring the progression 
of the disease. Furthermore, the markers of the present invention, alone or in combination, can 
be used for detection of the source of metastasis found in anatomical places other than lung. 
Also, one or more of the markers may optionally be used in combination with one or more other 
lung cancer markers (other than those described herein). According to an optional embodiment 
of the present invention, such a combination may be used to differentiate between various types 
of lung cancer, such as small cell carcinoma; large cell carcinoma; squamous cell carcinoma; 
and adenocarcinoma. Furthermore, the markers of the present invention, alone or in 
combination, can be used for detection of other types of tumors by elimination (for example, for 
such detection of carcinoid tumors, which are 5% of lung cancers). 

The markers of the present invention, alone or in combination, can be used for 
prognosis, prediction, screening, early diagnosis, staging, therapy selection and treatment 
monitoring of lung cancer. For example, optionally and preferably, these markers may be used 
for staging lung cancer and/or monitoring the progression of the disease. Furthermore, the 
markers of the present invention, alone or in combination, can be used for detection of the 
source of metastasis found in anatomical places other than lung. Also, one or more of the
markers may optionally be used in combination with one or more other lung cancer markers
(other than those described herein).

Biomolecular sequences (amino acid and/or nucleic acid sequences) uncovered using the
methodology of the present invention and described herein can be efficiently utilized as tissue or 
pathological markers and/or as drugs or drug targets for treating or preventing a disease. 

These markers are specifically released to the bloodstream under conditions of lung 
cancer, and/or are otherwise expressed at a much higher level and/or specifically expressed in 
lung cancer tissue or cells. The measurement of these markers, alone or in combination, in 
patient samples provides information that the diagnostician can correlate with a probable 
diagnosis of lung cancer. 

The present invention therefore also relates to diagnostic assays for lung cancer and/or 
an indicative condition, and methods of use of such markers for detection of lung cancer and/or 
an indicative condition, optionally and preferably in a sample taken from a subject (patient), 
which is more preferably some type of blood sample. 

In another embodiment, the present invention relates to bridges, tails, heads and/or 
insertions, and/or analogs, homologs and derivatives of such peptides. Such bridges, tails, heads 
and/or insertions are described in greater detail below with regard to the Examples. 

As used herein a "tail" refers to a peptide sequence at the end of an amino acid sequence 
that is unique to a splice variant according to the present invention. Therefore, a splice variant 
having such a tail may optionally be considered as a chimera, in that at least a first portion of the 
splice variant is typically highly homologous (often 100% identical) to a portion of the 
corresponding known protein, while at least a second portion of the variant comprises the tail. 

As used herein a "head" refers to a peptide sequence at the beginning of an amino acid 
sequence that is unique to a splice variant according to the present invention. Therefore, a splice 
variant having such a head may optionally be considered as a chimera, in that at least a first 
portion of the splice variant comprises the head, while at least a second portion is typically 
highly homologous (often 100% identical) to a portion of the corresponding known protein. 
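
The "tail" and "head" definitions above can be illustrated with a minimal sketch. It assumes the simplest case, in which the shared portion of the variant is 100% identical to the known protein and no alignment is needed; the function names and the example sequences are hypothetical and are not taken from the sequences of the present invention.

```python
def split_tail(known: str, variant: str) -> tuple[str, str]:
    """Return (shared_prefix, tail) for a variant that diverges from the
    known protein toward its C-terminal end (simplified: exact matching)."""
    i = 0
    limit = min(len(known), len(variant))
    while i < limit and known[i] == variant[i]:
        i += 1
    return variant[:i], variant[i:]


def split_head(known: str, variant: str) -> tuple[str, str]:
    """Return (head, shared_suffix) for a variant that diverges from the
    known protein at its N-terminal end (simplified: exact matching)."""
    j = 0
    limit = min(len(known), len(variant))
    while j < limit and known[-1 - j] == variant[-1 - j]:
        j += 1
    cut = len(variant) - j
    return variant[:cut], variant[cut:]


# Hypothetical sequences, for illustration only.
known_protein = "MKTAYIAKQRQISFVK"
tail_variant = "MKTAYIAKGGSW"   # shares the first 8 residues, then a unique tail
head_variant = "WWPQRQISFVK"    # unique head, then the last 8 residues

shared, tail = split_tail(known_protein, tail_variant)
head, shared_end = split_head(known_protein, head_variant)
```

In practice the shared portion is described as "typically highly homologous" rather than always identical, so a real comparison would presumably rely on a sequence alignment rather than on exact character matching.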

As used herein "an edge portion" refers to a connection between two portions of a splice 
variant according to the present invention that were not joined in the wild type or known 
protein. An edge may optionally arise due to a join between the above "known protein" portion 
of a variant and the tail, for example, and/or may occur if an internal portion of the wild type 
sequence is no longer present, such that two portions of the sequence are now contiguous in the 
splice variant that were not contiguous in the known protein. A "bridge" may optionally be an
edge portion as described above, but may also include a join between a head and a "known
protein" portion of a variant, or a join between a tail and a "known protein" portion of a variant,
or a join between an insertion and a "known protein" portion of a variant.

Optionally and preferably, a bridge between a tail or a head or a unique insertion, and a
"known protein" portion of a variant, comprises at least about 10 amino acids, more preferably
at least about 20 amino acids, most preferably at least about 30 amino acids, and even more
preferably at least about 40 amino acids, in which at least one amino acid is from the
tail/head/insertion and at least one amino acid is from the "known protein" portion of a variant.
Also optionally, the bridge may comprise any number of amino acids from about 10 to about 40
amino acids (for example, 10, 11, 12, 13 ... 37, 38, 39, 40 amino acids in length, or any number
in between).

It should be noted that a bridge cannot be extended beyond the length of the sequence in 
either direction, and it should be assumed that every bridge description is to be read in such 
manner that the bridge length does not extend beyond the sequence itself. 

Furthermore, bridges are described with regard to a sliding window in certain contexts
below. For example, certain descriptions of the bridges feature the following format: a bridge
between two edges (in which a portion of the known protein is not present in the variant) may
optionally be described as follows: a bridge portion of CONTIG-NAME_P1 (representing the
name of the protein), comprising a polypeptide having a length "n", wherein n is at least about
10 amino acids in length, optionally at least about 20 amino acids in length, preferably at least
about 30 amino acids in length, more preferably at least about 40 amino acids in length and most
preferably at least about 50 amino acids in length, wherein at least two amino acids comprise
XX (2 amino acids in the center of the bridge, one from each end of the edge), having a
structure as follows (numbering according to the sequence of CONTIG-NAME_P1): a sequence
starting from any of amino acid numbers 49-x to 49 (for example); and ending at any of amino
acid numbers 50 + ((n-2) - x) (for example), in which x varies from 0 to n-2. In this example, it
should also be read as including bridges in which n is any number of amino acids between 10-50
amino acids in length. Furthermore, the bridge polypeptide cannot extend beyond the sequence,
so it should be read such that 49-x (for example) is not less than 1, nor 50 + ((n-2) - x) (for
example) greater than the total sequence length.
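
The sliding-window reading described above can be made concrete with a short sketch. It enumerates, for a junction between two adjacent residues (49 and 50 in the example), every window of length n that starts at 49-x and ends at 50 + ((n-2) - x) for x from 0 to n-2, and discards windows that would extend beyond the sequence. The function names are hypothetical, and the code is offered only as one possible interpretation of the text, assuming 1-based residue numbering.

```python
def bridge_windows(left_pos: int, n: int, seq_len: int) -> list[tuple[int, int]]:
    """Enumerate 1-based (start, end) windows of length n that span the
    junction between residues left_pos and left_pos + 1."""
    if n < 2:
        raise ValueError("a bridge needs at least one residue on each side")
    windows = []
    for x in range(0, n - 1):                    # x varies from 0 to n-2
        start = left_pos - x                     # e.g. 49 - x
        end = (left_pos + 1) + ((n - 2) - x)     # e.g. 50 + ((n-2) - x)
        # The bridge cannot extend beyond the sequence in either direction.
        if start >= 1 and end <= seq_len:
            windows.append((start, end))
    return windows


def bridge_peptides(sequence: str, left_pos: int, n: int) -> list[str]:
    """Return the peptide for each valid window (converting to 0-based slices)."""
    return [sequence[s - 1:e] for s, e in bridge_windows(left_pos, n, len(sequence))]


# Junction between residues 49 and 50, bridge length 10, 100-residue protein.
example = bridge_windows(49, 10, 100)
```

Every window returned has length n and contains both residues flanking the junction, which corresponds to the requirement that at least one amino acid comes from each side of the edge.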

In another embodiment, this invention provides antibodies specifically recognizing the
splice variants and polypeptide fragments thereof of this invention. Preferably such antibodies
differentially recognize splice variants of the present invention but do not recognize a
corresponding known protein (such known proteins are discussed with regard to their splice
variants in the Examples below).

In another embodiment, this invention provides an isolated nucleic acid molecule
encoding for a splice variant according to the present invention, having a nucleotide sequence as
set forth in any one of the sequences listed herein, or a sequence complementary thereto. In
another embodiment, this invention provides an isolated nucleic acid molecule, having a
nucleotide sequence as set forth in any one of the sequences listed herein, or a sequence
complementary thereto. In another embodiment, this invention provides an oligonucleotide of at
least about 12 nucleotides, specifically hybridizable with the nucleic acid molecules of this
invention. In another embodiment, this invention provides vectors, cells, liposomes and
compositions comprising the isolated nucleic acids of this invention.

In another embodiment, this invention provides a method for detecting a splice variant
according to the present invention in a biological sample, comprising: contacting a biological
sample with an antibody specifically recognizing a splice variant according to the present
invention under conditions whereby the antibody specifically interacts with the splice variant in
the biological sample but does not recognize known corresponding proteins (wherein the known
protein is discussed with regard to its splice variant(s) in the Examples below), and detecting
said interaction; wherein the presence of an interaction correlates with the presence of a splice
variant in the biological sample.

In another embodiment, this invention provides a method for detecting a splice variant
nucleic acid sequence in a biological sample, comprising: hybridizing the isolated nucleic acid
molecules or oligonucleotide fragments of at least about a minimum length to a nucleic acid
material of a biological sample and detecting a hybridization complex; wherein the presence of a
hybridization complex correlates with the presence of a splice variant nucleic acid sequence in
the biological sample.

According to the present invention, the splice variants described herein are non-limiting
examples of markers for diagnosing lung cancer. Each splice variant marker of the present
invention can be used alone or in combination, for various uses, including but not limited to,
prognosis, prediction, screening, early diagnosis, determination of progression, therapy selection
and treatment monitoring of lung cancer.

According to optional but preferred embodiments of the present invention, any marker
according to the present invention may optionally be used alone or in combination. Such a
combination may optionally comprise a plurality of markers described herein, optionally
including any subcombination of markers, and/or a combination featuring at least one other
marker, for example a known marker. Furthermore, such a combination may optionally and
preferably be used as described above with regard to determining a ratio between a quantitative
or semi-quantitative measurement of any marker described herein to any other marker described
herein, and/or any other known marker, and/or any other marker. With regard to such a ratio
between any marker described herein (or a combination thereof) and a known marker, more
preferably the known marker comprises the "known protein" as described in greater detail
below with regard to each cluster or gene.

According to other preferred embodiments of the present invention, a splice variant
protein or a fragment thereof, or a splice variant nucleic acid sequence or a fragment thereof,
may be featured as a biomarker for detecting lung cancer, such that a biomarker may optionally
comprise any of the above.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to a splice variant protein as described herein. Any oligopeptide or
peptide relating to such an amino acid sequence or fragment thereof may optionally also
(additionally or alternatively) be used as a biomarker, including but not limited to the unique
amino acid sequences of these proteins that are depicted as tails, heads, insertions, edges or
bridges. The present invention also optionally encompasses antibodies capable of recognizing,
and/or being elicited by, such oligopeptides or peptides.

The present invention also optionally and preferably encompasses any nucleic acid 
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to a 
splice variant of the present invention as described above, optionally for any application. 

Non-limiting examples of methods or assays are described below.

The present invention also relates to kits based upon such diagnostic methods or assays.

Various embodiments of the present invention encompass nucleic acid sequences
described hereinabove; fragments thereof, sequences hybridizable therewith, sequences
homologous thereto, sequences encoding similar polypeptides with different codon usage,
altered sequences characterized by mutations, such as deletion, insertion or substitution of one or
more nucleotides, either naturally occurring or artificially induced, either randomly or in a
targeted fashion.

The present invention encompasses nucleic acid sequences described herein; fragments
thereof, sequences hybridizable therewith, sequences homologous thereto [e.g., at least 50%, at
least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at
least 95% or more, say 100%, identical to the nucleic acid sequences set forth below], sequences
encoding similar polypeptides with different codon usage, altered sequences characterized by
mutations, such as deletion, insertion or substitution of one or more nucleotides, either naturally
occurring or man-induced, either randomly or in a targeted fashion. The present invention also
encompasses homologous nucleic acid sequences (i.e., which form a part of a polynucleotide
sequence of the present invention) which include sequence regions unique to the polynucleotides
of the present invention.
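
As an illustration of the percent-identity thresholds listed above, the following sketch computes identity over a pair of sequences that have already been aligned (gaps written as "-"). It is a simplified assumption rather than the method of the present invention: identity is counted only over columns in which neither sequence has a gap, whereas alignment tools may normalize differently. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length,
    counted over the columns in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue                 # skip gapped columns (an assumption)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned nucleotide sequences: 9 of 10 positions match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
meets_85_percent = identity >= 85.0
```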

In cases where the polynucleotide sequences of the present invention encode previously
unidentified polypeptides, the present invention also encompasses novel polypeptides or portions
thereof, which are encoded by the isolated polynucleotide and respective nucleic acid fragments
thereof described hereinabove.

A "nucleic acid fragment" or an "oligonucleotide" or a "polynucleotide" are used herein
interchangeably to refer to a polymer of nucleic acids. A polynucleotide sequence of the present
invention refers to a single or double stranded nucleic acid sequence which is isolated and
provided in the form of an RNA sequence, a complementary polynucleotide sequence (cDNA), a
genomic polynucleotide sequence and/or a composite polynucleotide sequence (e.g., a
combination of the above).

As used herein the phrase "complementary polynucleotide sequence" refers to a
sequence, which results from reverse transcription of messenger RNA using a reverse
transcriptase or any other RNA dependent DNA polymerase. Such a sequence can be
subsequently amplified in vivo or in vitro using a DNA dependent DNA polymerase.

As used herein the phrase "genomic polynucleotide sequence" refers to a sequence
derived (isolated) from a chromosome and thus it represents a contiguous portion of a
chromosome.

As used herein the phrase "composite polynucleotide sequence" refers to a sequence,
which is composed of genomic and cDNA sequences. A composite sequence can include some
exonal sequences required to encode the polypeptide of the present invention, as well as some
intronic sequences interposing therebetween. The intronic sequences can be of any source,
including of other genes, and typically will include conserved splicing signal sequences. Such
intronic sequences may further include cis acting expression regulatory elements.

Preferred embodiments of the present invention encompass oligonucleotide probes. 

An example of an oligonucleotide probe which can be utilized by the present invention is
a single stranded polynucleotide which includes a sequence complementary to the unique
sequence region of any variant according to the present invention, including but not limited to a
nucleotide sequence coding for an amino acid sequence of a bridge, tail, head and/or insertion
according to the present invention, and/or the equivalent portions of any nucleotide sequence
given herein (including but not limited to a nucleotide sequence of a node, segment or amplicon
described herein).

Alternatively, an oligonucleotide probe of the present invention can be designed to
hybridize with a nucleic acid sequence encompassed by any of the above nucleic acid sequences,
particularly the portions specified above, including but not limited to a nucleotide sequence
coding for an amino acid sequence of a bridge, tail, head and/or insertion according to the present
invention, and/or the equivalent portions of any nucleotide sequence given herein (including but
not limited to a nucleotide sequence of a node, segment or amplicon described herein).
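
A single stranded probe complementary to a chosen region, as described above, can be sketched as the reverse complement of that region. The example below assumes a DNA alphabet (for instance a cDNA representation of the transcript) and uses hypothetical function names and a hypothetical sequence; it does not address probe specificity, melting temperature or backbone modifications.

```python
# Translation table for the four DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_for_region(transcript: str, start: int, length: int) -> str:
    """Return a probe complementary to transcript[start:start + length]
    (0-based start), written 5' to 3'."""
    target = transcript[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("requested region extends beyond the sequence")
    return reverse_complement(target)


# Hypothetical sequence; the "unique region" here is positions 2-5 (0-based).
probe = probe_for_region("AAAACCGGTTTT", 2, 4)
```

The probe returned is the reverse complement of the target region, so that it can base-pair with the target strand in antiparallel orientation.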

Oligonucleotides designed according to the teachings of the present invention can be
generated according to any oligonucleotide synthesis method known in the art such as enzymatic
synthesis or solid phase synthesis. Equipment and reagents for executing solid-phase synthesis
are commercially available from, for example, Applied Biosystems. Any other means for such
synthesis may also be employed; the actual synthesis of the oligonucleotides is well within the
capabilities of one skilled in the art and can be accomplished via established methodologies as
detailed in, for example, "Molecular Cloning: A Laboratory Manual" Sambrook et al., (1989);
"Current Protocols in Molecular Biology" Volumes I-III Ausubel, R. M., ed. (1994); Ausubel et
al., "Current Protocols in Molecular Biology", John Wiley and Sons, Baltimore, Maryland
(1989); Perbal, "A Practical Guide to Molecular Cloning", John Wiley & Sons, New York
(1988) and "Oligonucleotide Synthesis" Gait, M. J., ed. (1984) utilizing solid phase chemistry,
e.g. cyanoethyl phosphoramidite followed by deprotection, desalting and purification by, for
example, an automated trityl-on method or HPLC.

Oligonucleotides used according to this aspect of the present invention are those having a
length selected from a range of about 10 to about 200 bases, preferably about 15 to about 150
bases, more preferably about 20 to about 100 bases, most preferably about 20 to about 50 bases.
Preferably, the oligonucleotide of the present invention features at least 17, at least 18, at least
19, at least 20, at least 22, at least 25, at least 30 or at least 40 bases specifically hybridizable
with the biomarkers of the present invention.

The oligonucleotides of the present invention may comprise heterocyclic nucleosides
consisting of purine and pyrimidine bases, bonded in a 3' to 5' phosphodiester linkage.
Preferably used oligonucleotides are those modified at one or more of the backbone,
internucleoside linkages or bases, as is broadly described hereinunder.

Specific examples of preferred oligonucleotides useful according to this aspect of the
present invention include oligonucleotides containing modified backbones or non-natural
internucleoside linkages. Oligonucleotides having modified backbones include those that retain
a phosphorus atom in the backbone, as disclosed in U.S. Pat. Nos. 4,469,863; 4,476,301;
5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131;
5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821;
5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; and 5,625,050.

Preferred modified oligonucleotide backbones include, for example, phosphorothioates,
chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters,
methyl and other alkyl phosphonates including 3'-alkylene phosphonates and chiral
phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and
aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates,
thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked
analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside
units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'. Various salts, mixed salts and free acid forms can
also be used.

Alternatively, modified oligonucleotide backbones that do not include a phosphorus atom
therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside
linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more
short chain heteroatomic or heterocyclic internucleoside linkages. These include those having
morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane
backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones;
methylene formacetyl and thioformacetyl backbones; alkene containing backbones; sulfamate
backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide
backbones; amide backbones; and others having mixed N, O, S and CH2 component parts, as
disclosed in U.S. Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033;
5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307;
5,561,225; 5,596,086; 5,602,240; 5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704;
5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439.

Other oligonucleotides which can be used according to the present invention are those
in which both the sugar and the internucleoside linkage, i.e., the backbone, of the nucleotide
units are replaced with novel groups. The base units are maintained for complementation with the
appropriate polynucleotide target. An example of such an oligonucleotide mimetic is
peptide nucleic acid (PNA). United States patents that teach the preparation of PNA compounds
include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of
which is herein incorporated by reference. Other backbone modifications which can be used in
the present invention are disclosed in U.S. Pat. No. 6,303,374.

Oligonucleotides of the present invention may also include base modifications or
substitutions. As used herein, "unmodified" or "natural" bases include the purine bases adenine
(A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U).
Modified bases include but are not limited to other synthetic and natural bases such as
5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine,
6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives
of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other
8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other
5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 8-azaguanine and
8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine.
Further bases particularly useful for increasing the binding affinity of the oligomeric compounds
of the invention include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6
substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine.
5-methylcytosine substitutions have been shown to increase nucleic acid duplex stability by
0.6-1.2 °C and are presently preferred base substitutions, even more particularly when combined
with 2'-O-methoxyethyl sugar modifications.

Another modification of the oligonucleotides of the invention involves chemically
linking to the oligonucleotide one or more moieties or conjugates, which enhance the activity,
cellular distribution or cellular uptake of the oligonucleotide. Such moieties include but are not
limited to lipid moieties such as a cholesterol moiety, cholic acid, a thioether, e.g.,
hexyl-S-tritylthiol, a thiocholesterol, an aliphatic chain, e.g., dodecandiol or undecyl residues, a
phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium
1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate, a polyamine or a polyethylene glycol chain, or
adamantane acetic acid, a palmityl moiety, or an octadecylamine or
hexylamino-carbonyl-oxycholesterol moiety, as disclosed in U.S. Pat. No. 6,303,374.

It is not necessary for all positions in a given oligonucleotide molecule to be uniformly
modified, and in fact more than one of the aforementioned modifications may be incorporated in
a single compound or even at a single nucleoside within an oligonucleotide.

It will be appreciated that oligonucleotides of the present invention may include further
modifications for more efficient use as diagnostic agents and/or to increase bioavailability and
therapeutic efficacy and to reduce cytotoxicity.

To enable cellular expression of the polynucleotides of the present invention, a nucleic
acid construct according to the present invention may be used, which includes at least a coding
region of one of the above nucleic acid sequences, and further includes at least one cis acting
regulatory element. As used herein, the phrase "cis acting regulatory element" refers to a
polynucleotide sequence, preferably a promoter, which binds a trans acting regulator and
regulates the transcription of a coding sequence located downstream thereto.

Any suitable promoter sequence can be used by the nucleic acid construct of the present
invention.

Preferably, the promoter utilized by the nucleic acid construct of the present invention is
active in the specific cell population transformed. Examples of cell type-specific and/or
tissue-specific promoters include promoters such as albumin, which is liver specific, lymphoid
specific promoters [Calame et al., (1988) Adv. Immunol. 43:235-275], in particular promoters of
T-cell receptors [Winoto et al., (1989) EMBO J. 8:729-733] and immunoglobulins [Banerji et al.
(1983) Cell 33:729-740], neuron-specific promoters such as the neurofilament promoter [Byrne
et al. (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477], pancreas-specific promoters [Edlund
et al. (1985) Science 230:912-916] or mammary gland-specific promoters such as the milk
whey promoter (U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166).
The nucleic acid construct of the present invention can further include an enhancer, which can
be adjacent or distant to the promoter sequence and can function in up regulating the
transcription therefrom.

The nucleic acid construct of the present invention preferably further includes an
appropriate selectable marker and/or an origin of replication. Preferably, the nucleic acid
construct utilized is a shuttle vector, which can both propagate in E. coli (wherein the construct
comprises an appropriate selectable marker and origin of replication) and be compatible with
propagation in cells, or integration in a gene and a tissue of choice. The construct according to
the present invention can be, for example, a plasmid, a bacmid, a phagemid, a cosmid, a phage,
a virus or an artificial chromosome.

Examples of suitable constructs include, but are not limited to, pcDNA3, pcDNA3.1
(+/-), pGL3, pZeoSV2 (+/-), pDisplay, pEF/myc/cyto and pCMV/myc/cyto, each of which is
commercially available from Invitrogen Co. (www.invitrogen.com). Examples of retroviral
vector and packaging systems are those sold by Clontech, San Diego, Calif., including Retro-X
vectors pLNCX and pLXSN, which permit cloning into multiple cloning sites and in which the
transgene is transcribed from the CMV promoter. Vectors derived from Mo-MuLV are also
included, such as pBabe, where the transgene will be transcribed from the 5' LTR promoter.

Currently preferred in vivo nucleic acid transfer techniques include transfection with
viral or non-viral constructs, such as adenovirus, lentivirus, Herpes simplex I virus, or
adeno-associated virus (AAV) and lipid-based systems. Useful lipids for lipid-mediated transfer of
the gene are, for example, DOTMA, DOPE, and DC-Chol [Tonkinson et al., Cancer Investigation,
14(1): 54-65 (1996)]. The most preferred constructs for use in gene therapy are viruses, most
preferably adenoviruses, AAV, lentiviruses, or retroviruses. A viral construct such as a
retroviral construct includes at least one transcriptional promoter/enhancer or locus-defining
element(s), or other elements that control gene expression by other means such as alternate
splicing, nuclear RNA export, or post-translational modification of messenger. Such vector
constructs also include a packaging signal, long terminal repeats (LTRs) or portions thereof, and
positive and negative strand primer binding sites appropriate to the virus used, unless it is
already present in the viral construct. In addition, such a construct typically includes a signal
sequence for secretion of the peptide from a host cell in which it is placed. Preferably the signal
sequence for this purpose is a mammalian signal sequence or the signal sequence of the
polypeptide variants of the present invention. Optionally, the construct may also include a
signal that directs polyadenylation, as well as one or more restriction sites and a translation
termination sequence. By way of example, such constructs will typically include a 5' LTR, a
tRNA binding site, a packaging signal, an origin of second-strand DNA synthesis, and a 3' LTR
or a portion thereof. Other vectors can be used that are non-viral, such as cationic lipids,
polylysine, and dendrimers.

Hybridization assays 

Detection of a nucleic acid of interest in a biological sample may optionally be effected
by hybridization-based assays using an oligonucleotide probe (non-limiting examples of probes
according to the present invention were previously described).

Traditional hybridization assays include PCR, RT-PCR, real-time PCR, RNase
protection, in-situ hybridization, primer extension, Southern blots (DNA detection), dot or slot
blots (DNA, RNA), and Northern blots (RNA detection) (NAT type assays are described in
greater detail below). More recently, PNAs have been described (Nielsen et al. 1999, Current
Opin. Biotechnol. 10:71-75). Other detection methods include kits containing probes on a
dipstick setup and the like.

Hybridization based assays which allow the detection of a variant of interest (i.e., DNA
or RNA) in a biological sample rely on the use of oligonucleotides which can be 10, 15, 20, or
30 to 100 nucleotides long, preferably from 10 to 50, more preferably from 40 to 50 nucleotides
long.

Thus, the isolated polynucleotides (oligonucleotides) of the present invention are
preferably hybridizable with any of the herein described nucleic acid sequences under moderate
to stringent hybridization conditions.

Moderate to stringent hybridization conditions are characterized as follows. Stringent
hybridization is effected using a hybridization solution containing 10 % dextran sulfate, 1 M
NaCl, 1 % SDS and 5 x 10^6 cpm 32P-labeled probe, at 65 °C, with a final wash solution of
0.2 x SSC and 0.1 % SDS and a final wash at 65 °C, whereas moderate hybridization is effected
using a hybridization solution containing 10 % dextran sulfate, 1 M NaCl, 1 % SDS and
5 x 10^6 cpm 32P-labeled probe, at 65 °C, with a final wash solution of 1 x SSC and 0.1 % SDS
and a final wash at 50 °C.

More generally, hybridization of short nucleic acids (below 200 bp in length, e.g. 17-40
bp in length) can be effected using the following exemplary hybridization protocols which can
be modified according to the desired stringency: (i) hybridization solution of 6 x SSC and 1 %
SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS,
100 µg/ml denatured salmon sperm DNA and 0.1 % nonfat dried milk, hybridization temperature
of 1 - 1.5 °C below the Tm, final wash solution of 3 M TMACl, 0.01 M sodium phosphate (pH
6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS at 1 - 1.5 °C below the Tm; (ii) hybridization solution
of 6 x SSC and 0.1 % SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA
(pH 7.6), 0.5 % SDS, 100 µg/ml denatured salmon sperm DNA and 0.1 % nonfat dried milk,
hybridization temperature of 2 - 2.5 °C below the Tm, final wash solution of 3 M TMACl, 0.01
M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS at 1 - 1.5 °C below the Tm,
final wash solution of 6 x SSC, and final wash at 22 °C; (iii) hybridization solution of 6 x SSC
and 1 % SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 %
SDS, 100 µg/ml denatured salmon sperm DNA and 0.1 % nonfat dried milk, hybridization
temperature.
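
By way of a non-limiting illustration, the hybridization temperatures that the above protocols
state relative to the Tm can be sketched in Python. The sketch assumes the Wallace rule (2 °C per
A or T, 4 °C per G or C), an approximation generally applied to short oligonucleotides under
SSC-type salt conditions and not to the TMACl alternative. The function names and the example
sequence are hypothetical and do not appear in the present specification.

```python
def wallace_tm(oligo: str) -> float:
    """Estimate Tm in degrees C with the Wallace rule (assumed approximation)."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def hybridization_window(tm: float, low_offset: float = 1.0,
                         high_offset: float = 1.5) -> tuple:
    """Return (coldest, warmest) temperature lying low..high degrees below Tm."""
    return (tm - high_offset, tm - low_offset)


if __name__ == "__main__":
    probe = "ACGTACGTACGTACGTA"  # hypothetical 17-mer
    tm = wallace_tm(probe)
    # Protocol (i): hybridize 1 - 1.5 degrees C below the Tm.
    print(tm, hybridization_window(tm))
    # Protocol (ii): hybridize 2 - 2.5 degrees C below the Tm.
    print(hybridization_window(tm, 2.0, 2.5))
```

In practice the Tm would more likely be determined empirically or with a nearest-neighbor model;
the rule above is used here only because it is short enough to read at a glance.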

The detection of hybrid duplexes can be carried out by a number of methods. Typically,
hybridization duplexes are separated from unhybridized nucleic acids and the labels bound to the
duplexes are then detected. Such labels refer to radioactive, fluorescent, biological or enzymatic
tags or labels of standard use in the art. A label can be conjugated to either the oligonucleotide
probes or the nucleic acids derived from the biological sample.

Probes can be labeled according to numerous well known methods. Non-limiting
examples of radioactive labels include 3H, 14C, 32P, and 35S. Non-limiting examples of
detectable markers include ligands, fluorophores, chemiluminescent agents, enzymes, and
antibodies. Other detectable markers for use with probes, which can enable an increase in
sensitivity of the method of the invention, include biotin and radio-nucleotides. It will become
evident to the person of ordinary skill that the choice of a particular label dictates the manner in
which it is bound to the probe.

For example, oligonucleotides of the present invention can be labeled subsequent to
synthesis, by incorporating biotinylated dNTPs or rNTP, or some similar means (e.g.,
photo-cross-linking a psoralen derivative of biotin to RNAs), followed by addition of labeled
streptavidin (e.g., phycoerythrin-conjugated streptavidin) or the equivalent. Alternatively, when
fluorescently-labeled oligonucleotide probes are used, fluorescein, lissamine, phycoerythrin,
rhodamine (Perkin Elmer Cetus), Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX (Amersham) and
others [e.g., Kricka et al. (1992), Academic Press San Diego, Calif.] can be attached to the
oligonucleotides.

Those skilled in the art will appreciate that wash steps may be employed to wash away 
excess target DNA or probe as well as unbound conjugate. Further, standard heterogeneous assay 
formats are suitable for detecting the hybrids using the labels present on the oligonucleotide 
primers and probes. 

It will be appreciated that a variety of controls may be usefully employed to improve 
accuracy of hybridization assays. For instance, samples may be hybridized to an irrelevant probe 
and treated with RNAse A prior to hybridization, to assess false hybridization. 

Although the present invention is not specifically dependent on the use of a label for the 
detection of a particular nucleic acid sequence, such a label might be beneficial, by increasing 
the sensitivity of the detection. Furthermore, it enables automation. Probes can be labeled 
according to numerous well known methods. 

As commonly known, radioactive nucleotides can be incorporated into probes of the
invention by several methods. Non-limiting examples of radioactive labels include 3H, 14C, 32P,
and 35S.

Probes of the invention can be utilized with naturally occurring sugar-phosphate
backbones as well as modified backbones including phosphorothioates, dithionates, alkyl
phosphonates and α-nucleotides and the like. Probes of the invention can be constructed of either
ribonucleic acid (RNA) or deoxyribonucleic acid (DNA), and preferably of DNA.

NAT Assays 

Detection of a nucleic acid of interest in a biological sample may also optionally be
effected by NAT-based assays, which involve nucleic acid amplification technology, such as
PCR for example (or variations thereof such as real-time PCR for example).

As used herein, a "primer" defines an oligonucleotide which is capable of annealing to 
(hybridizing with) a target sequence, thereby creating a double stranded region which can serve 
as an initiation point for DNA synthesis under suitable conditions. 

Amplification of a selected, or target, nucleic acid sequence may be carried out by a
number of suitable methods. See generally Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14.
Numerous amplification techniques have been described and can be readily adapted to suit
particular needs of a person of ordinary skill. Non-limiting examples of amplification techniques
include polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement
amplification (SDA), transcription-based amplification, the Qβ replicase system and NASBA
(Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86, 1173-1177; Lizardi et al., 1988,
BioTechnology 6:1197-1202; Malek et al., 1994, Methods Mol. Biol., 28:253-260; and
Sambrook et al., 1989, supra).

The terminology "amplification pair" (or "primer pair") refers herein to a pair of
oligonucleotides (oligos) of the present invention, which are selected to be used together in
amplifying a selected nucleic acid sequence by one of a number of types of amplification
processes, preferably a polymerase chain reaction. Other types of amplification processes
include ligase chain reaction, strand displacement amplification, or nucleic acid sequence-based
amplification, as explained in greater detail below. As commonly known in the art, the oligos
are designed to bind to a complementary sequence under selected conditions.

In one particular embodiment, a nucleic acid sample from a patient is
amplified under conditions which favor the amplification of the most abundant differentially
expressed nucleic acid. In one preferred embodiment, RT-PCR is carried out on an mRNA
sample from a patient under conditions which favor the amplification of the most abundant
mRNA. In another preferred embodiment, the amplification of the differentially expressed
nucleic acids is carried out simultaneously. It will be realized by a person skilled in the art that
such methods could be adapted for the detection of differentially expressed proteins instead of
differentially expressed nucleic acid sequences.

The nucleic acid (i.e. DNA or RNA) for practicing the present invention may be 
obtained according to well known methods. 

Oligonucleotide primers of the present invention may be of any suitable length,
depending on the particular assay format and the particular needs and targeted genomes
employed. Optionally, the oligonucleotide primers are at least 12 nucleotides in length,
preferably between 15 and 24 nucleotides, and they may be adapted to be especially suited to a
chosen nucleic acid amplification system. As commonly known in the art, the oligonucleotide
primers can be designed by taking into consideration the melting point of hybridization thereof
with its targeted sequence (Sambrook et al., 1989, Molecular Cloning - A Laboratory Manual,
2nd Edition, CSH Laboratories; Ausubel et al., 1989, in Current Protocols in Molecular Biology,
John Wiley & Sons Inc., N.Y.).

It will be appreciated that antisense oligonucleotides may be employed to quantify
expression of a splice isoform of interest. Such detection is effected at the pre-mRNA level.
Essentially the ability to quantitate transcription from a splice site of interest can be effected
based on splice site accessibility. Oligonucleotides may compete with splicing factors for the
splice site sequences. Thus, low activity of the antisense oligonucleotide is indicative of
splicing activity.

The polymerase chain reaction and other nucleic acid amplification reactions are well
known in the art (various non-limiting examples of these reactions are described in greater detail
below). The pair of oligonucleotides according to this aspect of the present invention are
preferably selected to have compatible melting temperatures (Tm), e.g., melting temperatures
which differ by less than 7 °C, preferably less than 5 °C, more preferably less than 4 °C,
most preferably less than 3 °C, ideally between 3 °C and 0 °C.
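
The graded Tm thresholds recited above for a primer pair can be expressed, purely as an
illustrative sketch, by a small Python function. The function name and the tier labels are
hypothetical conveniences; only the 7, 5, 4 and 3 °C thresholds are taken from the text, and the
Tm values are assumed to have been determined beforehand by any suitable method.

```python
def tm_compatibility(tm_first: float, tm_second: float) -> str:
    """Classify a primer pair by the absolute difference of its two Tm values."""
    diff = abs(tm_first - tm_second)
    if diff < 3.0:
        return "most preferred"      # differs by less than 3 degrees C
    if diff < 4.0:
        return "more preferred"      # differs by less than 4 degrees C
    if diff < 5.0:
        return "preferred"           # differs by less than 5 degrees C
    if diff < 7.0:
        return "compatible"          # differs by less than 7 degrees C
    return "not compatible"          # 7 degrees C or more apart


if __name__ == "__main__":
    # Hypothetical Tm values for three candidate primer pairs.
    for pair in ((60.0, 58.5), (60.0, 54.0), (60.0, 52.0)):
        print(pair, tm_compatibility(*pair))
```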

Polymerase Chain Reaction (PCR): The polymerase chain reaction (PCR), as described
in U.S. Pat. Nos. 4,683,195 and 4,683,202 to Mullis and Mullis et al., is a method of increasing
the concentration of a segment of target sequence in a mixture of genomic DNA without cloning
or purification. This technology provides one approach to the problems of low target sequence
concentration. PCR can be used to directly increase the concentration of the target to an easily
detectable level. This process for amplifying the target sequence involves the introduction of a
molar excess of two oligonucleotide primers which are complementary to their respective strands
of the double-stranded target sequence to the DNA mixture containing the desired target
sequence. The mixture is denatured and then allowed to hybridize. Following hybridization, the
primers are extended with polymerase so as to form complementary strands. The steps of
denaturation, hybridization (annealing), and polymerase extension (elongation) can be repeated
as often as needed, in order to obtain relatively high concentrations of a segment of the desired
target sequence.

The length of the segment of the desired target sequence is determined by the relative
positions of the primers with respect to each other, and, therefore, this length is a controllable
parameter. Because the desired segments of the target sequence become the dominant sequences
(in terms of concentration) in the mixture, they are said to be "PCR-amplified."

Ligase Chain Reaction (LCR or LAR): The ligase chain reaction [LCR; sometimes
referred to as "Ligase Amplification Reaction" (LAR)] has developed into a well-recognized
alternative method of amplifying nucleic acids. In LCR, four oligonucleotides, namely two
adjacent oligonucleotides which uniquely hybridize to one strand of target DNA and a
complementary set of adjacent oligonucleotides which hybridize to the opposite strand, are mixed
and DNA ligase is added to the mixture. Provided that there is complete complementarity at the
junction, ligase will covalently link each set of hybridized molecules. Importantly, in LCR, two
probes are ligated together only when they base-pair with sequences in the target sample, without
gaps or mismatches. Repeated cycles of denaturation and ligation amplify a short segment of DNA.
LCR has also been used in combination with PCR to achieve enhanced detection of single-base
changes: see for example Segev, PCT Publication No. WO9001069 A1 (1990). However,
because the four oligonucleotides used in this assay can pair to form two short ligatable
fragments, there is the potential for the generation of target-independent background signal. The
use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.
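
The ligation condition described above (two adjacent probes are joined only when they base-pair
with the target without gaps or mismatches) can be sketched in Python as follows. This is an
idealized string model offered for illustration only: it ignores enzyme kinetics and the
target-independent background noted above, and the function names and sequences are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_ligate(target: str, probe_5prime: str, probe_3prime: str) -> bool:
    """True if the two probes, laid end to end, pair perfectly with the target.

    All sequences are written 5' to 3'. The probes are complementary to the
    target strand, so their concatenation must equal the reverse complement
    of one contiguous stretch of the target (no gap, no mismatch).
    """
    joined = (probe_5prime + probe_3prime).upper()
    return reverse_complement(joined) in target.upper()


if __name__ == "__main__":
    target = "AAGCTTGGCCATGCA"                      # hypothetical target strand
    print(can_ligate(target, "ATGGC", "CAAGC"))    # perfect match
    print(can_ligate(target, "ATGGT", "CAAGC"))    # mismatch at the junction
    print(can_ligate(target, "ATGG", "CAAGC"))     # one-base gap at the junction
```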

Self-Sustained Synthetic Reaction (3SR/NASBA): The self-sustained sequence replication
reaction (3SR) is a transcription-based in vitro amplification system that can exponentially
amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for
mutation detection. In this method, an oligonucleotide primer is used to add a phage RNA
polymerase promoter to the 5' end of the sequence of interest. In a cocktail of enzymes and
substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and
ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of
transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest. The
use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g.,
200-300 base pairs).

Q-Beta (Qβ) Replicase: In this method, a probe which recognizes the sequence of
interest is attached to the replicatable RNA template for Qβ replicase. A previously identified
major problem with false positives resulting from the replication of unhybridized probes has
been addressed through use of a sequence-specific ligation step. However, available
thermostable DNA ligases are not effective on this RNA substrate, so the ligation must be
performed by T4 DNA ligase at low temperatures (37 degrees C). This prevents the use of high
temperature as a means of achieving specificity as in the LCR; the ligation event can be used to
detect a mutation at the junction site, but not elsewhere.

A successful diagnostic method must be very specific. A straight-forward method of
controlling the specificity of nucleic acid hybridization is by controlling the temperature of the
reaction. While the 3SR/NASBA, and Qβ systems are all able to generate a large quantity of
signal, one or more of the enzymes involved in each cannot be used at high temperature (i.e., >
55 degrees C). Therefore the reaction temperatures cannot be raised to prevent non-specific
hybridization of the probes. If probes are shortened in order to make them melt more easily at
low temperatures, the likelihood of having more than one perfect match in a complex genome
increases. For these reasons, PCR and LCR currently dominate the research field in detection
technologies.

The basis of the amplification procedure in the PCR and LCR is the fact that the products
of one cycle become usable templates in all subsequent cycles, consequently doubling the
population with each cycle. The final yield of any such doubling system can be expressed as:

(1+X)^n = y, where "X" is the mean efficiency (percent copied in each cycle), "n" is the number
of cycles, and "y" is the overall efficiency, or yield of the reaction. If every copy of a target
DNA is utilized as a template in every cycle of a polymerase chain reaction, then the mean
efficiency is 100 %. If 20 cycles of PCR are performed, then the yield will be 2^20, or 1,048,576
copies of the starting material. If the reaction conditions reduce the mean efficiency to 85 %,
then the yield in those 20 cycles will be only 1.85^20, or 220,513 copies of the starting
material. In other words, a PCR running at 85 % efficiency will yield only 21 % as much final
product, compared to a reaction running at 100 % efficiency. A reaction that is reduced to 50 %
mean efficiency will yield less than 1 % of the possible product.
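
The arithmetic of the yield expression (1+X)^n = y can be checked with a short Python sketch.
The function names are hypothetical; the efficiencies and cycle count are those recited above, and
the cycle estimate is returned as a fractional number rather than being rounded.

```python
import math


def pcr_yield(mean_efficiency: float, cycles: int) -> float:
    """Return y = (1 + X)^n, the fold amplification after n cycles."""
    if not 0.0 <= mean_efficiency <= 1.0:
        raise ValueError("mean efficiency must lie between 0 and 1")
    return (1.0 + mean_efficiency) ** cycles


def cycles_for_fold(fold: float, mean_efficiency: float) -> float:
    """Return the fractional n that solves (1 + X)^n = fold."""
    if mean_efficiency <= 0.0:
        raise ValueError("mean efficiency must be positive")
    return math.log(fold) / math.log(1.0 + mean_efficiency)


if __name__ == "__main__":
    maximum = pcr_yield(1.0, 20)  # 2^20 copies at 100 % efficiency
    for x in (1.0, 0.85, 0.5):
        y = pcr_yield(x, 20)
        print(f"X = {x:.0%}: {y:,.0f} copies, {y / maximum:.1%} of maximum")
    # Cycles needed at 50 % efficiency to reach the 2^20-fold level.
    print(f"{cycles_for_fold(2 ** 20, 0.5):.1f} cycles")
```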

In practice, routine polymerase chain reactions rarely achieve the theoretical maximum
yield, and PCRs are usually run for more than 20 cycles to compensate for the lower yield. At
50 % mean efficiency, it would take 34 cycles to achieve the million-fold amplification
theoretically possible in 20, and at lower efficiencies, the number of cycles required becomes
prohibitive. In addition, any background products that amplify with a better mean efficiency
than the intended target will become the dominant products.

Also, many variables can influence the mean efficiency of PCR, including target DNA
length and secondary structure, primer length and design, primer and dNTP concentrations, and
buffer composition, to name but a few. Contamination of the reaction with exogenous DNA
(e.g., DNA spilled onto lab surfaces) or cross-contamination is also a major consideration.
Reaction conditions must be carefully optimized for each different primer pair and target
sequence, and the process can take days, even for an experienced investigator. The
laboriousness of this process, including numerous technical considerations and other factors,
presents a significant drawback to using PCR in the clinical setting. Indeed, PCR has yet to
penetrate the clinical market in a significant way. The same concerns arise with LCR, as LCR
must also be optimized to use different oligonucleotide sequences for each target sequence. In
addition, both methods require expensive equipment, capable of precise temperature cycling.

Many applications of nucleic acid detection technologies, such as in studies of allelic
variation, involve not only detection of a specific sequence in a complex background, but also
the discrimination between sequences with few, or single, nucleotide differences. One method of
the detection of allele-specific variants by PCR is based upon the fact that it is difficult for Taq
polymerase to synthesize a DNA strand when there is a mismatch between the template strand
and the 3' end of the primer. An allele-specific variant may be detected by the use of a primer
that is perfectly matched with only one of the possible alleles; the mismatch to the other allele
acts to prevent the extension of the primer, thereby preventing the amplification of that sequence.
This method has a substantial limitation in that the base composition of the mismatch influences
the ability to prevent extension across the mismatch, and certain mismatches do not prevent
extension or have only a minimal effect.
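
The allele-specific priming principle described above can be illustrated by the following Python
sketch. It is a deliberately idealized model in which extension is assumed to occur only when the
3'-terminal base of the primer matches the allele; as noted above, certain real mismatches do not
prevent extension. The function name and the sequences are hypothetical, and each allele is
written as the strand having the same sense as the primer.

```python
def three_prime_matches(primer: str, allele_sense: str) -> bool:
    """Idealized test of whether a primer's 3'-terminal base matches an allele.

    The primer body (all bases except the last) is used to locate the
    annealing site; the 3'-terminal base is then compared with the allele
    base at the next position.
    """
    primer, allele_sense = primer.upper(), allele_sense.upper()
    body, tip = primer[:-1], primer[-1]
    site = allele_sense.find(body)
    if site < 0:
        raise ValueError("primer body does not anneal to this sequence")
    tip_index = site + len(body)
    if tip_index >= len(allele_sense):
        raise ValueError("sequence ends before the 3'-terminal position")
    return allele_sense[tip_index] == tip


if __name__ == "__main__":
    allele_a = "CCGATTACGGATCA"   # hypothetical allele carrying A at the variant site
    allele_b = "CCGATTACGGGTCA"   # hypothetical allele carrying G at the variant site
    primer = "GATTACGGA"          # 3' end placed on the variant base of allele A
    print(three_prime_matches(primer, allele_a))
    print(three_prime_matches(primer, allele_b))
```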

A similar 3'-mismatch strategy is used with greater effect to prevent ligation in the LCR.
Any mismatch effectively blocks the action of the thermostable ligase, but LCR still has the
drawback of target-independent background ligation products initiating the amplification.
Moreover, the combination of PCR with subsequent LCR to identify the nucleotides at individual
positions is also a clearly cumbersome proposition for the clinical laboratory.

The direct detection method according to various preferred embodiments of the present
invention may be, for example, a cycling probe reaction (CPR) or a branched DNA analysis.

When a sufficient amount of a nucleic acid to be detected is available, there are
advantages to detecting that sequence directly, instead of making more copies of that target
(e.g., as in PCR and LCR). Most notably, a method that does not amplify the signal
exponentially is more amenable to quantitative analysis. Even if the signal is enhanced by
attaching multiple dyes to a single oligonucleotide, the correlation between the final signal
intensity and amount of target is direct. Such a system has an additional advantage that the
products of the reaction will not themselves promote further reaction, so contamination of lab
surfaces by the products is not as much of a concern. Recently devised techniques have sought to
eliminate the use of radioactivity and/or improve the sensitivity in automatable formats. Two
examples are the "Cycling Probe Reaction" (CPR), and "Branched DNA" (bDNA).

Cycling probe reaction (CPR): The cycling probe reaction (CPR) uses a long chimeric
oligonucleotide in which a central portion is made of RNA while the two termini are made of
DNA. Hybridization of the probe to a target DNA and exposure to a thermostable RNase H
causes the RNA portion to be digested. This destabilizes the remaining DNA portions of the
duplex, releasing the remainder of the probe from the target DNA and allowing another probe
molecule to repeat the process. The signal, in the form of cleaved probe molecules, accumulates
at a linear rate. While the repeating process increases the signal, the RNA portion of the
oligonucleotide is vulnerable to RNases that may be carried through sample preparation.

Branched DNA: Branched DNA (bDNA) involves oligonucleotides with branched
structures that allow each individual oligonucleotide to carry 35 to 40 labels (e.g., alkaline
phosphatase enzymes). While this enhances the signal from a hybridization event, signal from
nonspecific binding is similarly increased.

The detection of at least one sequence change according to various preferred
embodiments of the present invention may be accomplished by, for example, restriction fragment
length polymorphism (RFLP) analysis, allele specific oligonucleotide (ASO) analysis,
Denaturing/Temperature Gradient Gel Electrophoresis (DGGE/TGGE), Single-Strand
Conformation Polymorphism (SSCP) analysis or Dideoxy fingerprinting (ddF).

The demand for tests which allow the detection of specific nucleic acid sequences and
sequence changes is growing rapidly in clinical diagnostics. As nucleic acid sequence data for
genes from humans and pathogenic organisms accumulates, the demand for fast, cost-effective,
and easy-to-use tests for as yet unknown mutations within specific sequences is rapidly increasing.

A handful of methods have been devised to scan nucleic acid segments for mutations.
One option is to determine the entire gene sequence of each test sample (e.g., a bacterial isolate).
For sequences under approximately 600 nucleotides, this may be accomplished using amplified
material (e.g., PCR reaction products). This avoids the time and expense associated with cloning
the segment of interest. However, specialized equipment and highly trained personnel are
required, and the method is too labor-intensive and expensive to be practical and effective in the
clinical setting.

In view of the difficulties associated with sequencing, a given segment of nucleic acid
may be characterized on several other levels. At the lowest resolution, the size of the molecule
can be determined by electrophoresis by comparison to a known standard run on the same gel. A
more detailed picture of the molecule may be achieved by cleavage with combinations of
restriction enzymes prior to electrophoresis, to allow construction of an ordered map. The
presence of specific sequences within the fragment can be detected by hybridization of a labeled
probe, or the precise nucleotide sequence can be determined by partial chemical degradation or
by primer extension in the presence of chain-terminating nucleotide analogs.

Restriction fragment length polymorphism (RFLP): For detection of single-base
differences between like sequences, the requirements of the analysis are often at the highest level
of resolution. For cases in which the position of the nucleotide in question is known in advance,
several methods have been developed for examining single base changes without direct
sequencing. For example, if a mutation of interest happens to fall within a restriction recognition
sequence, a change in the pattern of digestion can be used as a diagnostic tool (e.g., restriction
fragment length polymorphism [RFLP] analysis).
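
As a minimal sketch of the RFLP principle described above, the following example digests a short sequence in silico and compares the fragment lengths of a wild-type and a mutant allele. The sequences are hypothetical, the helper `digest` is an illustrative name, and only one strand and one enzyme (the EcoRI recognition sequence GAATTC, cut after the first G) are considered.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting `sequence` at every occurrence of
    the recognition sequence `site`; `cut_offset` is the number of bases of
    the site that remain on the upstream fragment."""
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 20-base alleles differing by a single base inside the site.
    wild_type = "ATGCGAATTCGGTACCTTAG"
    mutant = "ATGCGAGTTCGGTACCTTAG"
    print(digest(wild_type, "GAATTC", 1))  # site present: two fragments
    print(digest(mutant, "GAATTC", 1))     # site destroyed: one fragment
```

The change in the fragment pattern is only observable because the hypothetical mutation lies inside the recognition sequence; a base change elsewhere in the fragment would leave the pattern unchanged.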

Single point mutations have also been detected by the creation or destruction of RFLPs.
Mutations are detected and localized by the presence and size of the RNA fragments generated
by cleavage at the mismatches. Single nucleotide mismatches in DNA heteroduplexes are also
recognized and cleaved by some chemicals, providing an alternative strategy to detect single
base substitutions, generically named the "Mismatch Chemical Cleavage" (MCC). However, this
method requires the use of osmium tetroxide and piperidine, two highly noxious chemicals
which are not suited for use in a clinical laboratory.

RFLP analysis suffers from low sensitivity and requires a large amount of sample. When
RFLP analysis is used for the detection of point mutations, it is, by its nature, limited to the
detection of only those single base changes which fall within a restriction sequence of a known
restriction endonuclease. Moreover, the majority of the available enzymes have 4 to 6 base-pair
recognition sequences, and cleave too frequently for many large-scale DNA manipulations.
Thus, it is applicable only in a small fraction of cases, as most mutations do not fall within such
sites.

A handful of rare-cutting restriction enzymes with 8 base-pair specificities have been
isolated and these are widely used in genetic mapping, but these enzymes are few in number, are
limited to the recognition of G+C-rich sequences, and cleave at sites that tend to be highly
clustered. Recently, endonucleases encoded by group I introns have been discovered that might
have greater than 12 base-pair specificity, but again, these are few in number.

Allele specific oligonucleotide (ASO): If the change is not in a recognition sequence,
then allele-specific oligonucleotides (ASOs) can be designed to hybridize in proximity to the
mutated nucleotide, such that a primer extension or ligation event can be used as the indicator of a
match or a mismatch. Hybridization with radioactively labeled allele-specific oligonucleotides
(ASO) also has been applied to the detection of specific point mutations. The method is based
on the differences in the melting temperature of short DNA fragments differing by a single
nucleotide. Stringent hybridization and washing conditions can differentiate between mutant and
wild-type alleles. The ASO approach applied to PCR products also has been extensively utilized
by various researchers to detect and characterize point mutations in ras genes and gsp/gip
oncogenes. Because of the presence of various nucleotide changes in multiple positions, the
ASO method requires the use of many oligonucleotides to cover all possible oncogenic
mutations.

With either of the techniques described above (i.e., RFLP and ASO), the precise location
of the suspected mutation must be known in advance of the test. That is to say, they are
inapplicable when one needs to detect the presence of a mutation at an unknown position within
a gene or sequence of interest.

Denaturing/Temperature Gradient Gel Electrophoresis (DGGE/TGGE): Two other
methods rely on detecting changes in electrophoretic mobility in response to minor sequence
changes. One of these methods, termed "Denaturing Gradient Gel Electrophoresis" (DGGE), is
based on the observation that slightly different sequences will display different patterns of local
melting when electrophoretically resolved on a gradient gel. In this manner, variants can be
distinguished, as differences in melting properties of homoduplexes versus heteroduplexes
differing in a single nucleotide can detect the presence of mutations in the target sequences
because of the corresponding changes in their electrophoretic mobilities. The fragments to be
analyzed, usually PCR products, are "clamped" at one end by a long stretch of GC base pairs
(30-80) to allow complete denaturation of the sequence of interest without complete dissociation
of the strands. The attachment of a GC "clamp" to the DNA fragments increases the fraction of
mutations that can be recognized by DGGE. Attaching a GC clamp to one primer is critical to
ensure that the amplified sequence has a low dissociation temperature. Modifications of the
technique have been developed, using temperature gradients, and the method can also be applied
to RNA:RNA duplexes.

Limitations on the utility of DGGE include the requirement that the denaturing conditions
must be optimized for each type of DNA to be tested. Furthermore, the method requires
specialized equipment to prepare the gels and maintain the needed high temperatures during
electrophoresis. The expense associated with the synthesis of the clamping tail on one
oligonucleotide for each sequence to be tested is also a major consideration. In addition, long
running times are required for DGGE. The long running time of DGGE was shortened in a
modification of DGGE called constant denaturant gel electrophoresis (CDGE). CDGE requires
that gels be performed under different denaturant conditions in order to reach high efficiency for
the detection of mutations.

A technique analogous to DGGE, termed temperature gradient gel electrophoresis
(TGGE), uses a thermal gradient rather than a chemical denaturant gradient. TGGE requires the
use of specialized equipment which can generate a temperature gradient perpendicularly oriented
relative to the electrical field. TGGE can detect mutations in relatively small fragments of DNA;
therefore, scanning of large gene segments requires the use of multiple PCR products prior to
running the gel.

Single-Strand Conformation Polymorphism (SSCP): Another common method, called
"Single-Strand Conformation Polymorphism" (SSCP), was developed by Hayashi, Sekiya and
colleagues and is based on the observation that single strands of nucleic acid can take on
characteristic conformations in non-denaturing conditions, and these conformations influence
electrophoretic mobility. The complementary strands assume sufficiently different structures
that one strand may be resolved from the other. Changes in sequences within the fragment will
also change the conformation, consequently altering the mobility and allowing this to be used as
an assay for sequence variations.

The SSCP process involves denaturing a DNA segment (e.g., a PCR product) that is
labeled on both strands, followed by slow electrophoretic separation on a non-denaturing
polyacrylamide gel, so that intramolecular interactions can form and not be disturbed during the
run. This technique is extremely sensitive to variations in gel composition and temperature. A
serious limitation of this method is the relative difficulty encountered in comparing data
generated in different laboratories, under apparently similar conditions.

Dideoxy fingerprinting (ddF): Dideoxy fingerprinting (ddF) is another technique
developed to scan genes for the presence of mutations. The ddF technique combines
components of Sanger dideoxy sequencing with SSCP. A dideoxy sequencing reaction is
performed using one dideoxy terminator and then the reaction products are electrophoresed on
nondenaturing polyacrylamide gels to detect alterations in mobility of the termination segments
as in SSCP analysis. While ddF is an improvement over SSCP in terms of increased sensitivity,
ddF requires the use of expensive dideoxynucleotides and this technique is still limited to the
analysis of fragments of the size suitable for SSCP (i.e., fragments of 200-300 bases for optimal
detection of mutations).

In addition to the above limitations, all of these methods are limited as to the size of the
nucleic acid fragment that can be analyzed. For the direct sequencing approach, sequences of
greater than 600 base pairs require cloning, with the consequent delays and expense of either
deletion sub-cloning or primer walking, in order to cover the entire fragment. SSCP and DGGE
have even more severe size limitations. Because of reduced sensitivity to sequence changes,
these methods are not considered suitable for larger fragments. Although SSCP is reportedly able
to detect 90 % of single-base substitutions within a 200 base-pair fragment, the detection drops
to less than 50 % for 400 base-pair fragments. Similarly, the sensitivity of DGGE decreases as
the length of the fragment reaches 500 base pairs. The ddF technique, as a combination of direct
sequencing and SSCP, is also limited by the relatively small size of the DNA that can be
screened.

According to a presently preferred embodiment of the present invention, the step of
searching for any of the nucleic acid sequences described here, in tumor cells or in cells derived
from a cancer patient, is effected by any suitable technique, including, but not limited to, nucleic
acid sequencing, polymerase chain reaction, ligase chain reaction, self-sustained synthetic
reaction, Qβ-Replicase, cycling probe reaction, branched DNA, restriction fragment length
polymorphism analysis, mismatch chemical cleavage, heteroduplex analysis, allele-specific
oligonucleotides, denaturing gradient gel electrophoresis, constant denaturant gel
electrophoresis, temperature gradient gel electrophoresis and dideoxy fingerprinting.

Detection may also optionally be performed with a chip or other such device. The nucleic
acid sample which includes the candidate region to be analyzed is preferably isolated, amplified
and labeled with a reporter group. This reporter group can be a fluorescent group such as
phycoerythrin. The labeled nucleic acid is then incubated with the probes immobilized on the
chip using a fluidics station; the fabrication of fluidics devices, and particularly
microcapillary devices, in silicon and glass substrates has been described.

Once the reaction is completed, the chip is inserted into a scanner and patterns of
hybridization are detected. The hybridization data is collected as a signal emitted from the
reporter groups already incorporated into the nucleic acid, which is now bound to the probes
attached to the chip. Since the sequence and position of each probe immobilized on the chip is
known, the identity of the nucleic acid hybridized to a given probe can be determined.
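
The read-out logic described above can be sketched as a simple lookup from chip position to probe sequence. The layout, the intensity values, the threshold and the function names below are hypothetical and serve only to illustrate how a known probe position yields the identity of the bound nucleic acid; perfect complementarity between probe and target is assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def call_hybridized_targets(probe_layout: dict, intensities: dict, threshold: float) -> dict:
    """For each chip position whose signal reaches `threshold`, infer the
    target sequence as the reverse complement of the probe at that position."""
    calls = {}
    for position, probe in probe_layout.items():
        if intensities.get(position, 0.0) >= threshold:
            calls[position] = reverse_complement(probe)
    return calls


if __name__ == "__main__":
    # Hypothetical two-probe layout: positions are (row, column) on the chip.
    layout = {(0, 0): "ACGTTG", (0, 1): "ACGCTG"}
    scan = {(0, 0): 950.0, (0, 1): 40.0}
    print(call_hybridized_targets(layout, scan, threshold=500.0))
```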

It will be appreciated that when utilized along with automated equipment, the above
described detection methods can be used to screen multiple samples for a disease and/or
pathological condition both rapidly and easily.

Amino acid sequences and peptides 

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one
or more amino acid residue is an analog or mimetic of a corresponding naturally occurring
amino acid, as well as to naturally occurring amino acid polymers. Polypeptides can be
modified, e.g., by the addition of carbohydrate residues to form glycoproteins. The terms
"polypeptide," "peptide" and "protein" include glycoproteins, as well as non-glycoproteins.

Polypeptide products can be biochemically synthesized such as by employing standard
solid phase techniques. Such methods include but are not limited to exclusive solid phase
synthesis, partial solid phase synthesis methods, fragment condensation and classical solution
synthesis. These methods are preferably used when the peptide is relatively short (i.e., 10 kDa)
and/or when it cannot be produced by recombinant techniques (i.e., not encoded by a nucleic
acid sequence) and therefore involves different chemistry.

Solid phase polypeptide synthesis procedures are well known in the art and further
described by John Morrow Stewart and Janis Dillaha Young, Solid Phase Peptide Syntheses (2nd
Ed., Pierce Chemical Company, 1984).

Synthetic polypeptides can optionally be purified by preparative high performance liquid
chromatography [Creighton T. (1983) Proteins, structures and molecular principles. WH
Freeman and Co. N.Y.], after which their composition can be confirmed via amino acid
sequencing.

In cases where large amounts of a polypeptide are desired, it can be generated using
recombinant techniques such as described by Bitter et al. (1987) Methods in Enzymol. 153:516-544,
Studier et al. (1990) Methods in Enzymol. 185:60-89, Brisson et al. (1984) Nature 310:511-514,
Takamatsu et al. (1987) EMBO J. 6:307-311, Coruzzi et al. (1984) EMBO J. 3:1671-1680,
Brogli et al. (1984) Science 224:838-843, Gurley et al. (1986) Mol. Cell. Biol. 6:559-565
and Weissbach & Weissbach, 1988, Methods for Plant Molecular Biology, Academic Press, NY,
Section VIII, pp 421-463.

The present invention also encompasses polypeptides encoded by the polynucleotide
sequences of the present invention, as well as polypeptides according to the amino acid
sequences described herein. The present invention also encompasses homologues of these
polypeptides; such homologues can be at least 50 %, at least 55 %, at least 60 %, at least 65 %, at
least 70 %, at least 75 %, at least 80 %, at least 85 %, at least 95 % or more, say 100 %,
homologous to the amino acid sequences set forth below, as can be determined using BlastP
software of the National Center of Biotechnology Information (NCBI) using default parameters,
optionally and preferably including the following: filtering on (this option filters repetitive or
low-complexity sequences from the query using the Seg (protein) program), scoring matrix is
BLOSUM62 for proteins, word size is 3, E value is 10, gap costs are 11, 1 (initialization and
extension), and number of alignments shown is 50. Optionally, nucleic acid sequence
identity/homology may be determined by using BlastN software of the National Center of
Biotechnology Information (NCBI) using default parameters, which preferably include using the
DUST filter program, and also preferably include having an E value of 10, filtering low
complexity sequences and a word size of 11. Finally, the present invention also encompasses
fragments of the above described polypeptides and polypeptides having mutations, such as
deletions, insertions or substitutions of one or more amino acids, either naturally occurring or
artificially induced, either randomly or in a targeted fashion.
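
As a simplified illustration of how a percentage threshold such as those listed above might be checked, the sketch below computes percent identity over the columns of an already computed pairwise alignment. It does not perform the alignment and is not BlastP; the function names, the gap character and the example sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same
    residue. Both inputs must be rows of one pairwise alignment (equal length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both rows.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True when the percent identity is at least `threshold_percent`."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical 7-column alignment with one gap and one substitution.
    print(percent_identity("MKT-LLV", "MKTALIV"))
    print(meets_threshold("MKT-LLV", "MKTALIV", 70.0))
```

A BLAST report expresses identities over the length of the reported local alignment, so values obtained from a different alignment, or from this sketch, are not necessarily comparable with BlastP output.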

It will be appreciated that peptides identified according to the present invention may be
degradation products, synthetic peptides or recombinant peptides as well as peptidomimetics,
typically, synthetic peptides and peptoids and semipeptoids which are peptide analogs, which
may have, for example, modifications rendering the peptides more stable while in a body or
more capable of penetrating into cells. Such modifications include, but are not limited to, N
terminus modification, C terminus modification, peptide bond modification, including, but not
limited to, CH2-NH, CH2-S, CH2-S=O, O=C-NH, CH2-O, CH2-CH2, S=C-NH, CH=CH or
CF=CH, backbone modifications, and residue modification. Methods for preparing
peptidomimetic compounds are well known in the art. Further details in this
respect are provided hereinunder.

Peptide bonds (-CO-NH-) within the peptide may be substituted, for example, by
N-methylated bonds (-N(CH3)-CO-), ester bonds (-C(R)H-C-O-O-C(R)-N-), ketomethylen bonds
(-CO-CH2-), α-aza bonds (-NH-N(R)-CO-), wherein R is any alkyl, e.g., methyl, carba bonds
(-CH2-NH-), hydroxyethylene bonds (-CH(OH)-CH2-), thioamide bonds (-CS-NH-), olefinic
double bonds (-CH=CH-), retro amide bonds (-NH-CO-), peptide derivatives (-N(R)-CH2-CO-),
wherein R is the "normal" side chain, naturally presented on the carbon atom.

These modifications can occur at any of the bonds along the peptide chain and even at 
several (2-3) at the same time. 

Natural aromatic amino acids, Trp, Tyr and Phe, may be substituted by synthetic
non-natural amino acids such as phenylglycine, TIC, naphthylalanine (Nol), ring-methylated
derivatives of Phe, halogenated derivatives of Phe or o-methyl-Tyr.

In addition to the above, the peptides of the present invention may also include one or 
more modified amino acids or one or more non-amino acid monomers (e.g. fatty acids, complex 
carbohydrates etc). 

As used herein in the specification and in the claims section below the term "amino acid"
or "amino acids" is understood to include the 20 naturally occurring amino acids; those amino
acids often modified post-translationally in vivo, including, for example, hydroxyproline,
phosphoserine and phosphothreonine; and other unusual amino acids including, but not limited
to, 2-aminoadipic acid, hydroxylysine, isodesmosine, nor-valine, nor-leucine and ornithine.
Furthermore, the term "amino acid" includes both D- and L-amino acids.

Table 1 below lists non-conventional or modified amino acids which can be used with the present
invention.

Table 1



Non-conventional amino acid | Code | Non-conventional amino acid | Code
α-aminobutyric acid | Abu | L-N-methylalanine | Nmala
α-amino-α-methylbutyrate | Mgabu | L-N-methylarginine | Nmarg
aminocyclopropane-carboxylate | Cpro | L-N-methylasparagine | Nmasn
 |  | L-N-methylaspartic acid | Nmasp
aminoisobutyric acid | Aib | L-N-methylcysteine | Nmcys
aminonorbornyl-carboxylate | Norb | L-N-methylglutamine | Nmgin
 |  | L-N-methylglutamic acid | Nmglu
cyclohexylalanine | Chexa | L-N-methylhistidine | Nmhis
cyclopentylalanine | Cpen | L-N-methylisoleucine | Nmile
D-alanine | Dal | L-N-methylleucine | Nmleu
D-arginine | Darg | L-N-methyllysine | Nmlys
D-aspartic acid | Dasp | L-N-methylmethionine | Nmmet
D-cysteine | Dcys | L-N-methylnorleucine | Nmnle
D-glutamine | Dgln | L-N-methylnorvaline | Nmnva
D-glutamic acid | Dglu | L-N-methylornithine | Nmorn
D-histidine | Dhis | L-N-methylphenylalanine | Nmphe
D-isoleucine | Dile | L-N-methylproline | Nmpro
D-leucine | Dleu | L-N-methylserine | Nmser
D-lysine | Dlys | L-N-methylthreonine | Nmthr
D-methionine | Dmet | L-N-methyltryptophan | Nmtrp
D-ornithine | Dorn | L-N-methyltyrosine | Nmtyr
D-phenylalanine | Dphe | L-N-methylvaline | Nmval
D-proline | Dpro | L-N-methylethylglycine | Nmetg
D-serine | Dser | L-N-methyl-t-butylglycine | Nmtbug
D-threonine | Dthr | L-norleucine | Nle
D-tryptophan | Dtrp | L-norvaline | Nva
D-tyrosine | Dtyr | α-methyl-aminoisobutyrate | Maib
D-valine | Dval | α-methyl-γ-aminobutyrate | Mgabu
D-α-methylalanine | Dmala | α-methylcyclohexylalanine | Mchexa
D-α-methylarginine | Dmarg | α-methylcyclopentylalanine | Mcpen
D-α-methylasparagine | Dmasn | α-methyl-α-napthylalanine | Manap
D-α-methylaspartate | Dmasp | α-methylpenicillamine | Mpen
D-α-methylcysteine | Dmcys | N-(4-aminobutyl)glycine | Nglu
D-α-methylglutamine | Dmgln | N-(2-aminoethyl)glycine | Naeg
D-α-methylhistidine | Dmhis | N-(3-aminopropyl)glycine | Norn
D-α-methylisoleucine | Dmile | N-amino-α-methylbutyrate | Nmaabu
D-α-methylleucine | Dmleu | α-napthylalanine | Anap
D-α-methyllysine | Dmlys | N-benzylglycine | Nphe
D-α-methylmethionine | Dmmet | N-(2-carbamylethyl)glycine | Ngln
D-α-methylornithine | Dmorn | N-(carbamylmethyl)glycine | Nasn
D-α-methylphenylalanine | Dmphe | N-(2-carboxyethyl)glycine | Nglu
D-α-methylproline | Dmpro | N-(carboxymethyl)glycine | Nasp
D-α-methylserine | Dmser | N-cyclobutylglycine | Ncbut
D-α-methylthreonine | Dmthr | N-cycloheptylglycine | Nchep
D-α-methyltryptophan | Dmtrp | N-cyclohexylglycine | Nchex
D-α-methyltyrosine | Dmty | N-cyclodecylglycine | Ncdec
D-α-methylvaline | Dmval | N-cyclododecylglycine | Ncdod
D-α-methylalanine | Dnmala | N-cyclooctylglycine | Ncoct
D-α-methylarginine | Dnmarg | N-cyclopropylglycine | Ncpro
D-α-methylasparagine | Dnmasn | N-cycloundecylglycine | Ncund
D-α-methylaspartate | Dnmasp | N-(2,2-diphenylethyl)glycine | Nbhm
D-α-methylcysteine | Dnmcys | N-(3,3-diphenylpropyl)glycine | Nbhe
D-N-methylleucine | Dnmleu | N-(3-indolylethyl)glycine | Nhtrp
D-N-methyllysine | Dnmlys | N-methyl-γ-aminobutyrate | Nmgabu
N-methylcyclohexylalanine | Nmchexa | D-N-methylmethionine | Dnmmet
D-N-methylornithine | Dnmorn | N-methylcyclopentylalanine | Nmcpen
N-methylglycine | Nala | D-N-methylphenylalanine | Dnmphe
N-methylaminoisobutyrate | Nmaib | D-N-methylproline | Dnmpro
N-(1-methylpropyl)glycine | Nile | D-N-methylserine | Dnmser
N-(2-methylpropyl)glycine | Nile | D-N-methylserine | Dnmser
N-(2-methylpropyl)glycine | Nleu | D-N-methylthreonine | Dnmthr
D-N-methyltryptophan | Dnmtrp | N-(1-methylethyl)glycine | Nva
D-N-methyltyrosine | Dnmtyr | N-methyl-α-napthylalanine | Nmanap
D-N-methylvaline | Dnmval | N-methylpenicillamine | Nmpen
γ-aminobutyric acid | Gabu | N-(p-hydroxyphenyl)glycine | Nhtyr
L-t-butylglycine | Tbug | N-(thiomethyl)glycine | Ncys
L-ethylglycine | Etg | penicillamine | Pen
L-homophenylalanine | Hphe | L-α-methylalanine | Mala
L-α-methylarginine | Marg | L-α-methylasparagine | Masn
L-α-methylaspartate | Masp | L-α-methyl-t-butylglycine | Mtbug
L-α-methylcysteine | Mcys | L-methylethylglycine | Metg
L-α-methylglutamine | Mgln | L-α-methylglutamate | Mglu
L-α-methylhistidine | Mhis | L-α-methylhomophenylalanine | Mhphe
L-α-methylisoleucine | Mile | N-(2-methylthioethyl)glycine | Nmet
D-N-methylglutamine | Dnmgln | N-(3-guanidinopropyl)glycine | Narg
D-N-methylglutamate | Dnmglu | N-(1-hydroxyethyl)glycine | Nthr
D-N-methylhistidine | Dnmhis | N-(hydroxyethyl)glycine | Nser
D-N-methylisoleucine | Dnmile | N-(imidazolylethyl)glycine | Nhis
D-N-methylleucine | Dnmleu | N-(3-indolylethyl)glycine | Nhtrp
D-N-methyllysine | Dnmlys | N-methyl-γ-aminobutyrate | Nmgabu
N-methylcyclohexylalanine | Nmchexa | D-N-methylmethionine | Dnmmet
D-N-methylornithine | Dnmorn | N-methylcyclopentylalanine | Nmcpen
N-methylglycine | Nala | D-N-methylphenylalanine | Dnmphe
N-methylaminoisobutyrate | Nmaib | D-N-methylproline | Dnmpro
N-(1-methylpropyl)glycine | Nile | D-N-methylserine | Dnmser
N-(2-methylpropyl)glycine | Nleu | D-N-methylthreonine | Dnmthr
D-N-methyltryptophan | Dnmtrp | N-(1-methylethyl)glycine | Nval
D-N-methyltyrosine | Dnmtyr | N-methyl-α-napthylalanine | Nmanap
D-N-methylvaline | Dnmval | N-methylpenicillamine | Nmpen
γ-aminobutyric acid | Gabu | N-(p-hydroxyphenyl)glycine | Nhtyr
L-t-butylglycine | Tbug | N-(thiomethyl)glycine | Ncys
L-ethylglycine | Etg | penicillamine | Pen
L-homophenylalanine | Hphe | L-α-methylalanine | Mala
L-α-methylarginine | Marg | L-α-methylasparagine | Masn
L-α-methylaspartate | Masp | L-α-methyl-t-butylglycine | Mtbug
L-α-methylcysteine | Mcys | L-methylethylglycine | Metg
L-α-methylglutamine | Mgln | L-α-methylglutamate | Mglu
L-α-methylhistidine | Mhis | L-α-methylhomophenylalanine | Mhphe
L-α-methylisoleucine | Mile | N-(2-methylthioethyl)glycine | Nmet
L-α-methylleucine | Mleu | L-α-methyllysine | Mlys
L-α-methylmethionine | Mmet | L-α-methylnorleucine | Mnle
L-α-methylnorvaline | Mnva | L-α-methylornithine | Morn
L-α-methylphenylalanine | Mphe | L-α-methylproline | Mpro
L-α-methylserine | mser | L-α-methylthreonine | Mthr
L-α-methylvaline | Mtrp | L-α-methyltyrosine | Mtyr
L-α-methylleucine | Mval | L-N-methylhomophenylalanine | Nmhphe
N-(N-(2,2-diphenylethyl)carbamylmethyl-glycine | Nnbhm | N-(N-(3,3-diphenylpropyl)carbamylmethyl(1)glycine | Nnbhe
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane | Nmbc |  | 

Since the peptides of the present invention are preferably utilized in diagnostics which
require the peptides to be in soluble form, the peptides of the present invention preferably
include one or more non-natural or natural polar amino acids, including but not limited to serine
and threonine, which are capable of increasing peptide solubility due to their hydroxyl-containing
side chain.

The peptides of the present invention are preferably utilized in a linear form, although it
will be appreciated that in cases where cyclization does not severely interfere with peptide
characteristics, cyclic forms of the peptide can also be utilized.

The peptides of the present invention can be biochemically synthesized such as by using
standard solid phase techniques. These methods include exclusive solid phase synthesis well
known in the art, partial solid phase synthesis methods, fragment condensation and classical solution
synthesis. These methods are preferably used when the peptide is relatively short (i.e., 10 kDa)
and/or when it cannot be produced by recombinant techniques (i.e., not encoded by a nucleic
acid sequence) and therefore involves different chemistry.

Synthetic peptides can be purified by preparative high performance liquid
chromatography, and their composition can be confirmed via amino acid sequencing.

In cases where large amounts of the peptides of the present invention are desired, the
peptides of the present invention can be generated using recombinant techniques such as
described by Bitter et al. (1987) Methods in Enzymol. 153:516-544, Studier et al. (1990)
Methods in Enzymol. 185:60-89, Brisson et al. (1984) Nature 310:511-514, Takamatsu et al.
(1987) EMBO J. 6:307-311, Coruzzi et al. (1984) EMBO J. 3:1671-1680, Brogli et al.
(1984) Science 224:838-843, Gurley et al. (1986) Mol. Cell. Biol. 6:559-565 and Weissbach &
Weissbach, 1988, Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pp
421-463 and also as described above.

Antibodies 

"Antibody" refers to a polypeptide ligand that is preferably substantially encoded by an 
immunoglobulin gene or immunoglobulin genes, or fragments thereof, which specifically binds 
and recognizes an epitope (e.g., an antigen). The recognized immunoglobulin genes include the
kappa and lambda light chain constant region genes, the alpha, gamma, delta, epsilon and mu
heavy chain constant region genes, and the myriad immunoglobulin variable region genes.
Antibodies exist, e.g., as intact immunoglobulins or as a number of well characterized fragments
produced by digestion with various peptidases. This includes, e.g., Fab' and F(ab')2 fragments.
The term "antibody," as used herein, also includes antibody fragments either produced by the
modification of whole antibodies or those synthesized de novo using recombinant DNA
methodologies. It also includes polyclonal antibodies, monoclonal antibodies, chimeric
antibodies, humanized antibodies, or single chain antibodies. "Fc" portion of an antibody refers
to that portion of an immunoglobulin heavy chain that comprises one or more heavy chain
constant region domains, CH1, CH2 and CH3, but does not include the heavy chain variable
region.

The functional fragments of antibodies, such as Fab, F(ab')2, and Fv that are capable of 
binding to macrophages, are described as follows: (1) Fab, the fragment which contains a 
monovalent antigen-binding fragment of an antibody molecule, can be produced by digestion of 
whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy
chain; (2) Fab', the fragment of an antibody molecule that can be obtained by treating whole
antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the
heavy chain; two Fab' fragments are obtained per antibody molecule; (3) F(ab')2, the fragment
of the antibody that can be obtained by treating whole antibody with the enzyme pepsin without
subsequent reduction; F(ab')2 is a dimer of two Fab' fragments held together by two disulfide
bonds; (4) Fv, defined as a genetically engineered fragment containing the variable region of the
light chain and the variable region of the heavy chain expressed as two chains; and (5) Single
chain antibody ("SCA"), a genetically engineered molecule containing the variable region of the 
light chain and the variable region of the heavy chain, linked by a suitable polypeptide linker as 
a genetically fused single chain molecule. 

Methods of producing polyclonal and monoclonal antibodies as well as fragments
thereof are well known in the art (see, for example, Harlow and Lane, Antibodies: A Laboratory
Manual, Cold Spring Harbor Laboratory, New York, 1988, incorporated herein by reference).

Antibody fragments according to the present invention can be prepared by proteolytic 
hydrolysis of the antibody or by expression in E. coli or mammalian cells (e.g., Chinese hamster
ovary cell culture or other protein expression systems) of DNA encoding the fragment.
Antibody fragments can be obtained by pepsin or papain digestion of whole antibodies by
conventional methods. For example, antibody fragments can be produced by enzymatic
cleavage of antibodies with pepsin to provide a 5S fragment denoted F(ab')2. This fragment can
be further cleaved using a thiol reducing agent, and optionally a blocking group for the
sulfhydryl groups resulting from cleavage of disulfide linkages, to produce 3.5S Fab'
monovalent fragments. Alternatively, an enzymatic cleavage using pepsin produces two
monovalent Fab' fragments and an Fc fragment directly. These methods are described, for
example, by Goldenberg, U.S. Pat. Nos. 4,036,945 and 4,331,647, and references contained
therein, which patents are hereby incorporated by reference in their entirety. See also Porter, R.
R. [Biochem. J. 73: 119-126 (1959)]. Other methods of cleaving antibodies, such as separation
of heavy chains to form monovalent light-heavy chain fragments, further cleavage of fragments,
or other enzymatic, chemical, or genetic techniques may also be used, so long as the fragments
bind to the antigen that is recognized by the intact antibody.

Fv fragments comprise an association of VH and VL chains. This association may be 
noncovalent, as described in Inbar et al. [Proc. Nat'l Acad. Sci. USA 69:2659-62 (1972)].
Alternatively, the variable chains can be linked by an intermolecular disulfide bond or
cross-linked by chemicals such as glutaraldehyde. Preferably, the Fv fragments comprise VH and VL
chains connected by a peptide linker. These single-chain antigen binding proteins (sFv) are
prepared by constructing a structural gene comprising DNA sequences encoding the VH and VL
domains connected by an oligonucleotide. The structural gene is inserted into an expression
vector, which is subsequently introduced into a host cell such as E. coli. The recombinant host
cells synthesize a single polypeptide chain with a linker peptide bridging the two V domains.
Methods for producing sFvs are described, for example, by Whitlow and Filpula, Methods 2:
97-105 (1991); Bird et al., Science 242:423-426 (1988); Pack et al., Bio/Technology 11:1271-77
(1993); and U.S. Pat. No. 4,946,778, which is hereby incorporated by reference in its entirety.

Another form of an antibody fragment is a peptide coding for a single
complementarity-determining region (CDR). CDR peptides ("minimal recognition units") can be obtained by
constructing genes encoding the CDR of an antibody of interest. Such genes are prepared, for
example, by using the polymerase chain reaction to synthesize the variable region from RNA of
antibody-producing cells. See, for example, Larrick and Fry [Methods, 2: 106-10 (1991)].

Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or
other antigen-binding subsequences of antibodies) which contain minimal sequence derived
from non-human immunoglobulin. Humanized antibodies include human immunoglobulins
(recipient antibody) in which residues from a complementarity determining region (CDR) of the
recipient are replaced by residues from a CDR of a non-human species (donor antibody) such as
mouse, rat or rabbit having the desired specificity, affinity and capacity. In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding non-human
residues. Humanized antibodies may also comprise residues which are found neither in the
recipient antibody nor in the imported CDR or framework sequences. In general, the humanized
antibody will comprise substantially all of at least one, and typically two, variable domains, in
which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the FR regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature,
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)].

Methods for humanizing non-human antibodies are well known in the art. Generally, a
humanized antibody has one or more amino acid residues introduced into it from a source which
is non-human. These non-human amino acid residues are often referred to as import residues,
which are typically taken from an import variable domain. Humanization can be essentially
performed following the method of Winter and co-workers [Jones et al., Nature, 321:522-525
(1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al., Science,
239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S.
Pat. No. 4,816,567), wherein substantially less than an intact human variable domain has been
substituted by the corresponding sequence from a non-human species. In practice, humanized
antibodies are typically human antibodies in which some CDR residues and possibly some FR
residues are substituted by residues from analogous sites in rodent antibodies.

Human antibodies can also be produced using various techniques known in the art, 
including phage display libraries [Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al. and Boerner et al. are
also available for the preparation of human monoclonal antibodies [Cole et al., Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and Boerner et al., J. Immunol.,
147(1):86-95 (1991)]. Similarly, human antibodies can be made by introduction of human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge, human
antibody production is observed, which closely resembles that seen in humans in all respects,
including gene rearrangement, assembly, and antibody repertoire. This approach is described,
for example, in U.S. Pat. Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology 10:
779-783 (1992); Lonberg et al., Nature 368: 856-859 (1994); Morrison, Nature 368: 812-13 (1994);
Fishwild et al., Nature Biotechnology 14: 845-51 (1996); Neuberger, Nature Biotechnology 14:
826 (1996); and Lonberg and Huszar, Intern. Rev. Immunol. 13: 65-93 (1995).

Preferably, the antibody of this aspect of the present invention specifically binds at least
one epitope of the polypeptide variants of the present invention. As used herein, the term
"epitope" refers to any antigenic determinant on an antigen to which the paratope of an antibody
binds.

Epitopic determinants usually consist of chemically active surface groupings of 
molecules such as amino acids or carbohydrate side chains and usually have specific three 
dimensional structural characteristics, as well as specific charge characteristics. 

Optionally, a unique epitope may be created in a variant due to a change in one or more
post-translational modifications, including but not limited to glycosylation and/or
phosphorylation, as described below. Such a change may also cause a new epitope to be
created, for example through removal of glycosylation at a particular site.

An epitope according to the present invention may also optionally comprise part or all of
a unique sequence portion of a variant according to the present invention in combination with at
least one other portion of the variant which is not contiguous to the unique sequence portion in
the linear polypeptide itself, yet which are able to form an epitope in combination. One or more
unique sequence portions may optionally combine with one or more other non-contiguous
portions of the variant (including a portion which may have high homology to a portion of the
known protein) to form an epitope.


Immunoassays 



WO 2006/131783 



PCT/IB2005/004037 



168 

In another embodiment of the present invention, an immunoassay can be used to
qualitatively or quantitatively detect and analyze markers in a sample. This method comprises:
providing an antibody that specifically binds to a marker; contacting a sample with the antibody;
and detecting the presence of a complex of the antibody bound to the marker in the sample.

To prepare an antibody that specifically binds to a marker, purified protein markers can
be used. Antibodies that specifically bind to a protein marker can be prepared using any suitable
methods known in the art.

After the antibody is provided, a marker can be detected and/or quantified using any of a
number of well recognized immunological binding assays. Useful assays include, for example,
an enzyme immune assay (EIA) such as enzyme-linked immunosorbent assay (ELISA), a
radioimmune assay (RIA), a Western blot assay, or a slot blot assay (see, e.g., U.S. Pat. Nos.
4,366,241; 4,376,110; 4,517,288; and 4,837,168). Generally, a sample obtained from a subject
can be contacted with the antibody that specifically binds the marker.

Optionally, the antibody can be fixed to a solid support to facilitate washing and
subsequent isolation of the complex, prior to contacting the antibody with a sample. Examples
of solid supports include but are not limited to glass or plastic in the form of, e.g., a microtiter
plate, a stick, a bead, or a microbead. Antibodies can also be attached to a solid support.

After incubating the sample with antibodies, the mixture is washed and the
antibody-marker complex formed can be detected. This can be accomplished by incubating the washed
mixture with a detection reagent. Alternatively, the marker in the sample can be detected using
an indirect assay, wherein, for example, a second, labeled antibody is used to detect bound
marker-specific antibody, and/or in a competition or inhibition assay wherein, for example, a
monoclonal antibody which binds to a distinct epitope of the marker is incubated
simultaneously with the mixture.

Throughout the assays, incubation and/or washing steps may be required after each
combination of reagents. Incubation steps can vary from about 5 seconds to several hours,
preferably from about 5 minutes to about 24 hours. However, the incubation time will depend
upon the assay format, marker, volume of solution, concentrations and the like. Usually the
assays will be carried out at ambient temperature, although they can be conducted over a range
of temperatures, such as 10 °C to 40 °C.

The immunoassay can be used to determine a test amount of a marker in a sample from a
subject. First, a test amount of a marker in a sample can be detected using the immunoassay
methods described above. If a marker is present in the sample, it will form an antibody-marker
complex with an antibody that specifically binds the marker under suitable incubation
conditions described above. The amount of an antibody-marker complex can optionally be
determined by comparing to a standard. As noted above, the test amount of marker need not be
measured in absolute units, as long as the unit of measurement can be compared to a control
amount and/or signal.

Preferably used are antibodies which specifically interact with the polypeptides of the
present invention and not with wild type proteins or other isoforms thereof, for example. Such
antibodies are directed, for example, to the unique sequence portions of the polypeptide variants
of the present invention, including but not limited to bridges, heads, tails and insertions described
in greater detail below. Preferred embodiments of antibodies according to the present invention
are described in greater detail with regard to the section entitled "Antibodies".
Radio-immunoassay (RIA): In one version, this method involves precipitation of the
desired substrate, in the methods detailed hereinbelow, with a specific antibody and
radiolabeled antibody binding protein (e.g., protein A labeled with 125I) immobilized on a
precipitable carrier such as agarose beads. The number of counts in the precipitated pellet is
proportional to the amount of substrate.

In an alternate version of the RIA, a labeled substrate and an unlabelled antibody binding
protein are employed. A sample containing an unknown amount of substrate is added in varying
amounts. The decrease in precipitated counts from the labeled substrate is proportional to the
amount of substrate in the added sample.

Enzyme linked immunosorbent assay (ELISA): This method involves fixation of a sample
(e.g., fixed cells or a proteinaceous solution) containing a protein substrate to a surface such as a
well of a microtiter plate. A substrate specific antibody coupled to an enzyme is applied and
allowed to bind to the substrate. Presence of the antibody is then detected and quantitated by a
colorimetric reaction employing the enzyme coupled to the antibody. Enzymes commonly
employed in this method include horseradish peroxidase and alkaline phosphatase. If well
calibrated and within the linear range of response, the amount of substrate present in the sample
is proportional to the amount of color produced. A substrate standard is generally employed to
improve quantitative accuracy.
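
By way of non-limiting illustration only, the use of a substrate standard within the linear range of response may be sketched as follows. The standard amounts, the absorbance readings and the function names are hypothetical and are not taken from the present disclosure; a straight-line calibration is assumed.

```python
def fit_standard_line(amounts, signals):
    """Least-squares line signal = slope * amount + intercept through the standards."""
    count = len(amounts)
    mean_amount = sum(amounts) / count
    mean_signal = sum(signals) / count
    spread = sum((a - mean_amount) ** 2 for a in amounts)
    covariance = sum((a - mean_amount) * (s - mean_signal)
                     for a, s in zip(amounts, signals))
    slope = covariance / spread
    return slope, mean_signal - slope * mean_amount


def amount_from_signal(signal, slope, intercept, lowest_signal, highest_signal):
    """Convert a measured signal to an amount, refusing to extrapolate."""
    if not lowest_signal <= signal <= highest_signal:
        raise ValueError("signal lies outside the calibrated linear range")
    return (signal - intercept) / slope


# Hypothetical standards: known amounts (ng per well) and absorbance readings.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.05, 0.25, 0.45, 0.85]
slope, intercept = fit_standard_line(standard_amounts, standard_signals)
estimated_amount = amount_from_signal(
    0.65, slope, intercept, min(standard_signals), max(standard_signals)
)
```

Readings outside the range spanned by the standards are rejected rather than extrapolated, since proportionality is stated above only for the linear range of response.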

Western blot: This method involves separation of a substrate from other protein by means
of an acrylamide gel followed by transfer of the substrate to a membrane (e.g., nylon or PVDF).
Presence of the substrate is then detected by antibodies specific to the substrate, which are in turn
detected by antibody binding reagents. Antibody binding reagents may be, for example, protein
A, or other antibodies. Antibody binding reagents may be radiolabeled or enzyme linked as
described hereinabove. Detection may be by autoradiography, colorimetric reaction or
chemiluminescence. This method allows both quantitation of an amount of substrate and
determination of its identity by a relative position on the membrane which is indicative of a
migration distance in the acrylamide gel during electrophoresis.

Immunohistochemical analysis: This method involves detection of a substrate in situ in
fixed cells by substrate specific antibodies. The substrate specific antibodies may be enzyme
linked or linked to fluorophores. Detection is by microscopy and subjective evaluation. If
enzyme linked antibodies are employed, a colorimetric reaction may be required.

Fluorescence activated cell sorting (FACS): This method involves detection of a
substrate in situ in cells by substrate specific antibodies. The substrate specific antibodies are
linked to fluorophores. Detection is by means of a cell sorting machine which reads the
wavelength of light emitted from each cell as it passes through a light beam. This method may
employ two or more antibodies simultaneously.

Radio-imaging Methods

These methods include, but are not limited to, positron emission tomography (PET) and
single photon emission computed tomography (SPECT). Both of these techniques are
non-invasive, and can be used to detect and/or measure a wide variety of tissue events and/or
functions, such as detecting cancerous cells for example. Unlike PET, SPECT can optionally be
used with two labels simultaneously. SPECT has some other advantages as well, for example
with regard to cost and the types of labels that can be used. For example, US Patent No.
6,696,686 describes the use of SPECT for detection of breast cancer, and is hereby incorporated
by reference as if fully set forth herein.

Display Libraries

According to still another aspect of the present invention there is provided a display
library comprising a plurality of display vehicles (such as phages, viruses or bacteria) each
displaying at least 6, at least 7, at least 8, at least 9, at least 10, 10-15, 12-17, 15-20, 15-30 or
20-50 consecutive amino acids derived from the polypeptide sequences of the present invention.
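
By way of non-limiting illustration only, the enumeration of stretches of consecutive amino acids of a given length from a polypeptide sequence may be sketched as follows. The ten-residue sequence and the function name are hypothetical and are not sequences of the present invention.

```python
def consecutive_peptides(sequence, length):
    """Return every stretch of `length` consecutive residues, in order of position."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[start:start + length]
            for start in range(len(sequence) - length + 1)]


# Hypothetical 10-residue sequence used only to show the windowing.
example_sequence = "MKTAYIAKQR"
six_mers = consecutive_peptides(example_sequence, 6)
```

A sequence shorter than the requested length simply yields no peptides.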

Methods of constructing such display libraries are well known in the art. Such methods 
are described in, for example, Young AC, et al., "The three-dimensional structures of a
polysaccharide binding antibody to Cryptococcus neoformans and its complex with a peptide
from a phage display library: implications for the identification of peptide mimotopes" J Mol
Biol 1997 Dec 12;274(4):622-34; Giebel LB et al. "Screening of cyclic peptide phage libraries
identifies ligands that bind streptavidin with high affinities" Biochemistry 1995 Nov
28;34(47):15430-5; Davies EL et al., "Selection of specific phage-display antibodies using
libraries derived from chicken immunoglobulin genes" J Immunol Methods 1995 Oct
12;186(1):125-35; Jones C et al. "Current trends in molecular recognition and bioseparation" J
Chromatogr A 1995 Jul 14;707(1):3-22; Deng SJ et al. "Basis for selection of improved
carbohydrate-binding single-chain antibodies from synthetic gene libraries" Proc Natl Acad Sci
USA 1995 May 23;92(11):4992-6; and Deng SJ et al. "Selection of antibody single-chain
variable fragments with improved carbohydrate binding by phage display" J Biol Chem 1994
Apr 1;269(13):9533-8, which are incorporated herein by reference.


The following sections relate to Candidate Marker Examples (first section) and to 
Experimental Data for these Marker Examples (second section). 

CANDIDATE MARKER EXAMPLES SECTION 
This Section relates to Examples of sequences according to the present invention,
including illustrative methods of selection thereof.

Description of the methodology undertaken to uncover the biomolecular sequences of 
the present invention 

Human ESTs and cDNAs were obtained from GenBank version 136 (June 15, 2003;
ftp.ncbi.nih.gov/genbank/release.notes/gb136.release.notes); NCBI genome assembly of April
2003; RefSeq sequences from June 2003; GenBank version 139 (December 2003); Human
Genome from NCBI (Build 34) (from Oct 2003); and RefSeq sequences from December 2003;
and from the LifeSeq library of Incyte Corporation (ESTs only; Wilmington, DE, USA). With
regard to GenBank sequences, the human EST sequences from the EST (GBEST) section and
the human mRNA sequences from the primate (GBPRI) section were used; also the human
nucleotide RefSeq mRNA sequences were used (see for example
www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html and for a reference to the EST section,
see www.ncbi.nlm.nih.gov/dbEST/; a general reference to dbEST, the EST database in
GenBank, may be found in Boguski et al., Nat Genet. 1993 Aug;4(4):332-3; all of which are
hereby incorporated by reference as if fully set forth herein).

Novel splice variants were predicted using the LEADS clustering and assembly system
as described in Sorek, R., Ast, G. & Graur, D. Alu-containing exons are alternatively spliced.
Genome Res 12, 1060-7 (2002); US patent No: 6,625,545; and U.S. Pat. Appl. No. 10/426,002,
published as US20040101876 on May 27 2004; all of which are hereby incorporated by
reference as if fully set forth herein. Briefly, the software cleans the expressed sequences from
repeats, vectors and immunoglobulins. It then aligns the expressed sequences to the genome
taking alternative splicing into account and clusters overlapping expressed sequences into
"clusters" that represent genes or partial genes.

These were annotated using the GeneCarta (Compugen, Tel Aviv, Israel) platform. The
GeneCarta platform includes a rich pool of annotations, sequence information (particularly of
spliced sequences), chromosomal information, alignments, and additional information such as
SNPs, gene ontology terms, expression profiles, functional analyses, detailed domain structures,
known and predicted proteins and detailed homology reports.

A brief explanation is provided with regard to the method of selecting the candidates.
However, it should be noted that this explanation is provided for descriptive purposes only, and is
not intended to be limiting in any way. The potential markers were identified by a computational
process that was designed to find genes and/or their splice variants that are over-expressed in
tumor tissues, by using databases of expressed sequences. Various parameters related to the
information in the EST libraries, determined according to a manual classification process, were
used to assist in locating genes and/or splice variants thereof that are over-expressed in
cancerous tissues. The detailed description of the selection method is presented in Example 1
below. The cancer biomarkers selection engine and the following wet validation stages are
schematically summarized in Figure 1.

EXAMPLE 1 

Identification of differentially expressed gene products - Algorithm

In order to distinguish between differentially expressed gene products and constitutively
expressed genes (i.e., housekeeping genes), an algorithm based on an analysis of frequencies was
configured. A specific algorithm for identification of transcripts over expressed in cancer is
described hereinbelow.

Dry analysis

Library annotation - EST libraries are manually classified according to: 

• Tissue origin 

• Biological source - Examples of frequently used biological sources for
construction of EST libraries include cancer cell lines; normal tissues;
cancer tissues; fetal tissues; and others such as normal cell lines and pools
of normal cell lines, cancer cell lines and combinations thereof. A
specific description of abbreviations used below with regard to these
tissues/cell lines etc. is given above.

• Protocol of library construction - various methods are known in
the art for library construction including normalized library construction;
non-normalized library construction; subtracted libraries; ORESTES and
others. It will be appreciated that at times the protocol of library
construction is not indicated.
The following rules are followed:

EST libraries originating from identical biological samples are considered as a single
library.

EST libraries which included above-average levels of contamination, such as DNA
contamination for example, were eliminated. The presence of such contamination was determined
as follows. For each library, the number of unspliced ESTs that are not fully contained within
other spliced sequences was counted. If the percentage of such sequences (as compared to all
other sequences) was at least 4 standard deviations above the average for all libraries being
analyzed, this library was tagged as being contaminated and was eliminated from further
consideration in the below analysis (see also Sorek, R. & Safer, H.M. A novel algorithm for
computational identification of contaminated EST libraries. Nucleic Acids Res 31, 1067-74
(2003) for further details).
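
By way of non-limiting illustration only, the contamination rule described above may be sketched as follows. The library names and counts are hypothetical. Two assumptions are made which are not stated above: the percentage is taken as the fraction of suspect ESTs out of all ESTs of the library, and the standard deviation is the population standard deviation over the libraries analyzed.

```python
from statistics import mean, pstdev


def contaminated_libraries(library_counts, threshold_sd=4.0):
    """Return names of libraries whose suspect-EST fraction is >= mean + threshold_sd * SD.

    library_counts maps a library name to (suspect_ests, total_ests), where
    suspect ESTs are unspliced ESTs not fully contained in spliced sequences.
    """
    fractions = {name: suspect / total
                 for name, (suspect, total) in library_counts.items()
                 if total > 0}
    values = list(fractions.values())
    if len(values) < 2:
        return []
    deviation = pstdev(values)  # assumption: population standard deviation
    if deviation == 0:
        return []
    cutoff = mean(values) + threshold_sd * deviation
    return sorted(name for name, value in fractions.items() if value >= cutoff)


# Hypothetical data: 29 clean libraries and one heavily contaminated library.
libraries = {f"lib_{index:02d}": (20, 1000) for index in range(29)}
libraries["lib_29"] = (600, 1000)
flagged = contaminated_libraries(libraries)
```

With a population standard deviation, a single outlier among n libraries can lie at most the square root of n - 1 standard deviations above the mean, so a cutoff of 4 standard deviations can only flag a library when at least 17 libraries are analyzed together.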

Clusters (genes) having at least five sequences, including at least two sequences from the
tissue of interest, were analyzed. Splice variants were identified by using the LEADS software
package as described above.
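
By way of non-limiting illustration only, the cluster selection rule described above may be sketched as follows. The representation of a cluster as a list of tissue labels, one per sequence, and the function name are assumptions made for the purpose of the sketch.

```python
def cluster_qualifies(sequence_tissues, tissue_of_interest,
                      min_sequences=5, min_from_tissue=2):
    """Apply the 'at least five sequences, at least two from the tissue' rule."""
    from_tissue = sum(1 for tissue in sequence_tissues
                      if tissue == tissue_of_interest)
    return len(sequence_tissues) >= min_sequences and from_tissue >= min_from_tissue


# Hypothetical cluster with five sequences, two of them from lung.
example_cluster = ["lung", "lung", "brain", "colon", "liver"]
example_passes = cluster_qualifies(example_cluster, "lung")
```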

EXAMPLE 2 

Identification of genes over expressed in cancer.

Two different scoring algorithms were developed. 

Libraries score - candidate sequences which are supported by a number of cancer libraries
are more likely to serve as specific and effective diagnostic markers.

The basic algorithm - for each cluster the number of cancer and normal libraries
contributing sequences to the cluster was counted. Fisher exact test was used to check if cancer
libraries are significantly over-represented in the cluster as compared to the total number of
cancer and normal libraries.
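
By way of non-limiting illustration only, a one-sided Fisher exact test of the kind referred to above may be sketched as follows. The sketch assumes that the libraries contributing to a cluster are treated as a draw from all cancer and normal libraries, and that the P-value is the upper tail of the corresponding hypergeometric distribution; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_upper_tail(cancer_in_cluster, normal_in_cluster,
                      total_cancer, total_normal):
    """P(at least `cancer_in_cluster` cancer libraries among those in the cluster)."""
    total = total_cancer + total_normal
    drawn = cancer_in_cluster + normal_in_cluster
    most_cancer = min(drawn, total_cancer)
    # math.comb returns 0 when more items are requested than are available,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(total_cancer, k) * comb(total_normal, drawn - k)
               for k in range(cancer_in_cluster, most_cancer + 1))
    return tail / comb(total, drawn)


# Hypothetical counts: 5 cancer and 0 normal libraries in the cluster,
# out of 10 cancer and 10 normal libraries overall.
p_value = fisher_upper_tail(5, 0, 10, 10)
```

A small P-value indicates that cancer libraries are over-represented in the cluster relative to their share of all libraries.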

Library counting: Small libraries (e.g., less than 1000 sequences) were excluded from
consideration unless they participate in the cluster. For this reason, the total number of libraries is
actually adjusted for each cluster.

Clones no. score - Generally, when the number of ESTs is much higher in the cancer 
libraries relative to the normal libraries it might indicate actual over-expression. 
The algorithm - 

Clone counting: For counting EST clones each library protocol class was given a weight
based on our belief of how much the protocol reflects actual expression levels:

(i) non-normalized : 1 

(ii) normalized : 0.2 

(iii) all other classes : 0.1 
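
By way of non-limiting illustration only, the weighting of EST clones by library protocol class may be sketched as follows. The weights are those listed above; the dictionary keys, the function name and the example counts are hypothetical.

```python
# Weights as listed above; any protocol class not named falls under "all other classes".
PROTOCOL_WEIGHTS = {"non-normalized": 1.0, "normalized": 0.2}
OTHER_CLASS_WEIGHT = 0.1


def weighted_clone_count(clones_by_protocol):
    """Sum the clone counts after applying the weight of each protocol class."""
    return sum(count * PROTOCOL_WEIGHTS.get(protocol, OTHER_CLASS_WEIGHT)
               for protocol, count in clones_by_protocol.items())


# Hypothetical cluster: 4 non-normalized, 10 normalized and 5 subtracted clones.
example_weighted = weighted_clone_count(
    {"non-normalized": 4, "normalized": 10, "subtracted": 5}
)
```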

Clones number score - The total weighted number of EST clones from cancer libraries
was compared to the EST clones from normal libraries. To avoid cases where one library
contributes to the majority of the score, the contribution of the library that gives most clones for a
given cluster was limited to 2 clones.
The score was computed as 




where: 

c - weighted number of "cancer" clones in the cluster.
C - weighted number of clones in all "cancer" libraries.
n - weighted number of "normal" clones in the cluster.
N - weighted number of clones in all "normal" libraries.
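
By way of non-limiting illustration only, the limiting of the largest library contribution and a score built from c, C, n and N may be sketched as follows. The score expression is not reproduced in the text above; the sketch therefore ASSUMES a ratio of the cancer clone frequency to the normal clone frequency with an added pseudocount, which is an assumption of the sketch and may differ from the expression actually used. All names and values are hypothetical.

```python
def cap_largest_library(weighted_clones_by_library, cap=2.0):
    """Limit the library contributing the most clones to `cap` weighted clones."""
    if not weighted_clones_by_library:
        return {}
    largest = max(weighted_clones_by_library, key=weighted_clones_by_library.get)
    capped = dict(weighted_clones_by_library)
    capped[largest] = min(capped[largest], cap)
    return capped


def assumed_clone_score(c, C, n, N, pseudocount=1.0):
    """ASSUMED form only: ((c + pseudocount) / C) / ((n + pseudocount) / N)."""
    return ((c + pseudocount) / C) / ((n + pseudocount) / N)


# Hypothetical cluster: weighted clones per library and the class of each library.
library_class = {"libA": "cancer", "libB": "cancer", "libC": "normal"}
capped = cap_largest_library({"libA": 7.0, "libB": 2.0, "libC": 1.0})
c = sum(value for name, value in capped.items() if library_class[name] == "cancer")
n = sum(value for name, value in capped.items() if library_class[name] == "normal")
score = assumed_clone_score(c, C=1000.0, n=n, N=2000.0)
```

The pseudocount keeps the assumed ratio finite when a cluster has no clones from normal libraries.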

Clones number score significance - Fisher exact test was used to check if EST clones from 
cancer libraries are significantly over-represented in the cluster as compared to the total number 
of EST clones from cancer and normal libraries. 

Two search approaches were used to find either general cancer-specific candidates or
tumor specific candidates.

• Libraries/sequences originating from tumor tissues are counted as well as
libraries originating from cancer cell lines ("normal" cell lines were
ignored).

• Only libraries/sequences originating from tumor tissues are counted.

EXAMPLE 3 
Identification of tissue specific genes 
For detection of tissue specific clusters, tissue libraries/sequences were compared to the
total number of libraries/sequences in the cluster. Similar statistical tools to those described above
were employed to identify tissue specific genes. Tissue abbreviations are the same as for
cancerous tissues, but are indicated with the header "normal tissue".

The algorithm - for each tested tissue T and for each tested cluster the following were
examined:

1. Each cluster includes at least 2 libraries from the tissue T and at least 3 clones
(weighted, as described above) from tissue T in the cluster; and

2. Clones from the tissue T are at least 40% of all the clones participating in the
tested cluster.

Fisher exact test P-values were computed both for library and weighted clone counts to
check that the counts are statistically significant.
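
By way of non-limiting illustration only, the two counting criteria for a tissue specific cluster may be sketched as follows. The thresholds are those stated above; the function and parameter names are hypothetical, and the Fisher exact test step is not included in the sketch.

```python
def passes_tissue_criteria(tissue_libraries, tissue_clones, cluster_clones,
                           min_libraries=2, min_clones=3.0, min_fraction=0.40):
    """Check the library, weighted-clone and 40% thresholds for a tissue T."""
    if cluster_clones <= 0:
        return False
    return (tissue_libraries >= min_libraries
            and tissue_clones >= min_clones
            and tissue_clones / cluster_clones >= min_fraction)


# Hypothetical cluster: 2 libraries and 5 weighted clones from tissue T,
# out of 10 weighted clones in the whole cluster.
example_specific = passes_tissue_criteria(2, 5.0, 10.0)
```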

EXAMPLE 4 

Identification of splice variants over expressed in cancer of clusters which are not over
expressed in cancer

Cancer-specific splice variants containing a unique region were identified.

Identification of unique sequence regions in splice variants

A Region is defined as a group of adjacent exons that always appear or do not appear 
together in each splice variant. 

A "segment" (sometimes also referred to as "seg" or "node") is defined as the shortest
contiguous transcribed region without known splicing inside.

Only reliable ESTs were considered for region and segment analysis. An EST was 
defined as unreliable if: 

(i) Unspliced; 

(ii) Not covered by RNA; 

(iii) Not covered by spliced ESTs; and 

(iv) Alignment to the genome ends in proximity of long poly-A stretch or starts in
proximity of long poly-T stretch.

Only reliable regions were selected for further scoring. Unique sequence regions were 
considered reliable if: 

(i) Aligned to the genome; and 

(ii) Regions supported by more than 2 ESTs. 
The algorithm 

Each unique sequence region divides the set of transcripts into 2 groups: 

(i) Transcripts containing this region (group TA). 

(ii) Transcripts not containing this region (group TB).

The set of EST clones of every cluster is divided into 3 groups:

(i) Supporting (originating from) transcripts of group TA (S1).

(ii) Supporting transcripts of group TB (S2).

(iii) Supporting transcripts from both groups (S3).

Library and clones number scores described above were given to S1 group.

Fisher Exact Test P-values were used to check if:
S1 is significantly enriched by cancer EST clones compared to S2; and
S1 is significantly enriched by cancer EST clones compared to cluster background
(S1+S2+S3).
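
By way of non-limiting illustration only, the division of the EST clones of a cluster into groups S1, S2 and S3, and the counting of cancer and normal clones in each group for the comparisons described above, may be sketched as follows. The clone names, transcript names and data structures are hypothetical, and the statistical test itself is not included in the sketch.

```python
def partition_clones(clone_transcripts, transcripts_with_region):
    """Assign each clone to S1 (group TA only), S2 (group TB only) or S3 (both)."""
    groups = {"S1": [], "S2": [], "S3": []}
    for clone, supported in clone_transcripts.items():
        supports_ta = any(t in transcripts_with_region for t in supported)
        supports_tb = any(t not in transcripts_with_region for t in supported)
        if supports_ta and supports_tb:
            groups["S3"].append(clone)
        elif supports_ta:
            groups["S1"].append(clone)
        elif supports_tb:
            groups["S2"].append(clone)
    return groups


def cancer_normal_counts(clones, cancer_clones):
    """Return (cancer, normal) clone counts for one group of clones."""
    cancer = sum(1 for clone in clones if clone in cancer_clones)
    return cancer, len(clones) - cancer


# Hypothetical cluster: the unique region is present only in transcript T1.
clone_transcripts = {
    "est1": {"T1"},
    "est2": {"T1"},
    "est3": {"T2"},
    "est4": {"T1", "T2"},
}
groups = partition_clones(clone_transcripts, transcripts_with_region={"T1"})
cancer_clones = {"est1", "est2"}
s1_counts = cancer_normal_counts(groups["S1"], cancer_clones)
s2_counts = cancer_normal_counts(groups["S2"], cancer_clones)
background_counts = cancer_normal_counts(list(clone_transcripts), cancer_clones)
```

The pairs of counts for S1 and S2, and for S1 and the cluster background, form the two contingency tables to which a Fisher exact test may then be applied.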

Identification of unique sequence regions and division of the group of transcripts
accordingly is illustrated in Figure 2. Each of these unique sequence regions corresponds to a
segment, also termed herein a "node".

Region 1: common to all transcripts, thus it is not considered for detecting variants;
Region 2: specific to Transcript 1; Region 3: specific to Transcripts 2 and 3; Region 4: specific
to Transcript 3; Region 5: specific to Transcript 1 and 2; Region 6: specific to Transcript 1.



EXAMPLE 5

Identification of cancer specific splice variants of genes over expressed in cancer 
A search for EST supported (no mRNA) regions for genes of: 

(i) known cancer markers 

(ii) Genes shown to be over- expressed in cancer in published micro-array experiments. 
25 Reliable EST supported-regions were defined as supported by minimum of one of the 

following: 

(i) 3 spliced ESTs; or 

(ii) 2 spliced ESTs from 2 libraries; 

(iii) 10 unspliced ESTs from 2 libraries, or 
30 (iv) 3 libraries. 
Actual Marker Examples

The following examples relate to specific actual marker examples. 

EXPERIMENTAL EXAMPLES SECTION 
This Section relates to Examples describing experiments involving these sequences, and
illustrative, non-limiting examples of methods, assays and uses thereof. The materials and
experimental procedures are explained first, as all experiments used them as a basis for the work
that was performed.

The markers of the present invention were tested with regard to their expression in
various cancerous and noncancerous tissue samples. A description of the samples used in the
panel is provided in Table 2 below. A description of the samples used in the normal tissue panel
is provided in Table 3 below. Tests were then performed as described in the "Materials and
Experimental Procedures" section below.

Table 2: Tissue samples in testing panel

sample rename | Lot No. | source | pathology | Grade | gender/age
1-B-Adeno G1 | A504117 | Biochain | Adenocarcinoma | 1 | F/29
2-B-Adeno G1 | A504118 | Biochain | Adenocarcinoma | 1 | M/64
95-B-Adeno G1 | A610063 | Biochain | Adenocarcinoma | 1 | F/54
12-B-Adeno G2 | A504119 | Biochain | Adenocarcinoma | 2 | F/74
75-B-Adeno G2 | A609217 | Biochain | Adenocarcinoma | 2 | M/65
77-B-Adeno G2 | A608301 | Biochain | Adenocarcinoma | 2 | M/44
13-B-Adeno G2-3 | A504116 | Biochain | Adenocarcinoma | 2-3 | M/64
89-B-Adeno G2-3 | A609077 | Biochain | Adenocarcinoma | 2-3 | M/62
76-B-Adeno G3 | A609218 | Biochain | Adenocarcinoma | 3 | M/57
94-B-Adeno G3 | A610118 | Biochain | Adenocarcinoma | 3 | M/68
3-CG-Adeno | CG-200 | Ichilov | Adenocarcinoma | | NA
14-CG-Adeno | CG-111 | Ichilov | Adenocarcinoma | | M/68
15-CG-Bronch adeno | CG-244 | Ichilov | Bronchioloalveolar adenocarcinoma | | M/74
45-B-Alvelous Adeno | A501221 | Biochain | Alveolus carcinoma | | F/50
44-B-Alvelous Adeno G2 | A501123 | Biochain | Alveolus carcinoma | 2 | F/61
19-B-Squamous G1 | A408175 | Biochain | Squamous carcinoma | 1 | M/78
16-B-Squamous G2 | A409091 | Biochain | Squamous carcinoma | 2 | F/68
17-B-Squamous G2 | A503183 | Biochain | Squamous carcinoma | 2 | M/57
21-B-Squamous G2 | A503187 | Biochain | Squamous carcinoma | 2 | M/52
78-B-Squamous G2 | A607125 | Biochain | Squamous Cell Carcinoma | 2 | M/62
80-B-Squamous G2 | A609163 | Biochain | Squamous Cell Carcinoma | 2 | M/74
18-B-Squamous G2-3 | A503387 | Biochain | Squamous Cell Carcinoma | 2-3 | M/63
81-B-Squamous G3 | A609076 | Biochain | Squamous Carcinoma | 3 | M/53
79-B-Squamous G3 | A609018 | Biochain | Squamous Cell Carcinoma | 3 | M/67
20-B-Squamous | A501121 | Biochain | Squamous Carcinoma | | M/64
22-B-Squamous | A503386 | Biochain | Squamous Carcinoma | | M/48
88-B-Squamous | A609219 | Biochain | Squamous Cell Carcinoma | | M/64
100-B-Squamous | A409017 | Biochain | Squamous Carcinoma | | M/64
23-CG-Squamous | CG-109(1) | Ichilov | Squamous Carcinoma | | M/65
24-CG-Squamous | CG-123 | Ichilov | Squamous Carcinoma | | M/76
25-CG-Squamous | CG-204 | Ichilov | Squamous Carcinoma | | M/72
87-B-Large cell G3 | A609165 | Biochain | Large Cell Carcinoma | 3 | F/47
38-B-Large cell | A504113 | Biochain | Large cell | | M/58
39-B-Large cell | A504114 | Biochain | Large cell | | F/35
82-B-Large cell | A609170 | Biochain | Large Cell Neuroendocrine Carcinoma | | M/68
30-B-Small cell carci G3 | A501389 | Biochain | small cell | 3 | M/34
31-B-Small cell carci G3 | A501390 | Biochain | small cell | 3 | F/59
32-B-Small cell carci G3 | A501391 | Biochain | small cell | 3 | M/30
33-B-Small cell carci G3 | A504115 | Biochain | small cell | 3 | M
86-B-Small cell carci G3 | A608032 | Biochain | Small Cell Carcinoma | 3 | F/52
83-B-Small cell carci | A609162 | Biochain | Small Cell Carcinoma | | F/47
84-B-Small cell carci | A609167 | Biochain | Small Cell Carcinoma | | F/59
85-B-Small cell carci | A609169 | Biochain | Small Cell Carcinoma | | M/66
46-B-N M44 | A501124 | Biochain | Normal M44 | | F/61
47-B-N | A503205 | Biochain | Normal PM | | M/26
48-B-N | A503206 | Biochain | Normal PM | | M/44
49-B-N | A503384 | Biochain | Normal PM | | M/27
50-B-N | A503385 | Biochain | Normal PM | | M/28
90-B-N | A608152 | Biochain | Normal (Pool 2) PM | | pool 2
91-B-N | A607257 | Biochain | Normal (Pool 2) PM | | pool 2
92-B-N | A503204 | Biochain | Normal PM | | M/28
93-Am-N | 111P0103A | Ambion | Normal PM | | F/61
96-Am-N | 36853 | Ambion | Normal PM | | F/43
97-Am-N | 36854 | Ambion | Normal PM | | M/46
98-Am-N | 36855 | Ambion | Normal PM | | F/72
99-Am-N | 36856 | Ambion | Normal PM | | M/31



Table 3: Tissue samples in normal panel:

Sample name | Lot no. | Source | Tissue | Pathology | Sex/Age
1-Am-Colon (C71) | 071P10B | Ambion | Colon | PM | F/43
2-B-Colon (C69) | A411078 | Biochain | Colon | PM-Pool of 10 | M&F
3-Cl-Colon (C70) | 1110101 | Clontech | Colon | PM-Pool of 3 | M&F
4-Am-Small Intestine | 091P0201A | Ambion | Small Intestine | PM | M/75
5-B-Small Intestine | A501158 | Biochain | Small Intestine | PM | M/63
6-B-Rectum | A605138 | Biochain | Rectum | PM | M/25
7-B-Rectum | A610297 | Biochain | Rectum | PM | M/24
8-B-Rectum | A610298 | Biochain | Rectum | PM | M/27
9-Am-Stomach | 110P04A | Ambion | Stomach | PM | M/16
10-B-Stomach | A501159 | Biochain | Stomach | PM | M/24
11-B-Esophagus | A603814 | Biochain | Esophagus | PM | M/26
12-B-Esophagus | A603813 | Biochain | Esophagus | PM | M/41
13-Am-Pancreas | 071P25C | Ambion | Pancreas | PM | M/25
14-CG-Pancreas | CG-255-2 | Ichilov | Pancreas | PM | M/75
15-B-Lung | A409363 | Biochain | Lung | PM | F/26
16-Am-Lung (L93) | 111P0103A | Ambion | Lung | PM | F/61
17-B-Lung (L92) | A503204 | Biochain | Lung | PM | M/28
18-Am-Ovary (O47) | 061P43A | Ambion | Ovary | PM | F/16
19-B-Ovary (O48) | A504087 | Biochain | Ovary | PM | F/51
20-B-Ovary (O46) | A504086 | Biochain | Ovary | PM | F/41
21-Am-Cervix | 101P0101A | Ambion | Cervix | PM | F/40
22-B-Cervix | A408211 | Biochain | Cervix | PM | F/36
23-B-Cervix | A504089 | Biochain | Cervix | PM-Pool of 5 | M&F
24-B-Uterus | A411074 | Biochain | Uterus | PM-Pool of 10 | M&F
25-B-Uterus | A409248 | Biochain | Uterus | PM | F/43
26-B-Uterus | A504090 | Biochain | Uterus | PM-Pool of 5 | M&F
27-B-Bladder | A501157 | Biochain | Bladder | PM | M/29
28-Am-Bladder | 071P02C | Ambion | Bladder | PM | M/20
29-B-Bladder | A504088 | Biochain | Bladder | PM-Pool of 5 | M&F
30-Am-Placenta | 021P33A | Ambion | Placenta | PB | F/33
31-B-Placenta | A410165 | Biochain | Placenta | PB | F/26
32-B-Placenta | A411073 | Biochain | Placenta | PB-Pool of 5 | M&F
33-B-Breast (B59) | A607155 | Biochain | Breast | PM | F/36
34-Am-Breast (B63) | 26486 | Ambion | Breast | PM | F/43
35-Am-Breast (B64) | 23036 | Ambion | Breast | PM | F/57
36-Cl-Prostate (P53) | 1070317 | Clontech | Prostate | PB-Pool of 47 | M&F
37-Am-Prostate (P42) | 061P04A | Ambion | Prostate | PM | M/47
38-Am-Prostate (P59) | 25955 | Ambion | Prostate | PM | M/62
39-Am-Testis | 111P0104A | Ambion | Testis | PM | M/25
40-B-Testis | A411147 | Biochain | Testis | PM | M/74
41-Cl-Testis | 1110320 | Clontech | Testis | PB-Pool of 45 | M&F
42-CG-Adrenal | CG-184-10 | Ichilov | Adrenal | PM | F/81
43-B-Adrenal | A610374 | Biochain | Adrenal | PM | F/83
44-B-Heart | A411077 | Biochain | Heart | PB-Pool of 5 | M&F
45-CG-Heart | CG-255-9 | Ichilov | Heart | PM | M/75
46-CG-Heart | CG-227-1 | Ichilov | Heart | PM | F/36
47-Am-Liver | 081P0101A | Ambion | Liver | PM | M/64
48-CG-Liver | CG-93-3 | Ichilov | Liver | PM | F/19
49-CG-Liver | CG-124-4 | Ichilov | Liver | PM | F/34
50-Cl-BM | 1110932 | Clontech | Bone Marrow | PM-Pool of 8 | M&F
51-CGEN-Blood | WBC#5 | CGEN | Blood | | M
52-CGEN-Blood | WBC#4 | CGEN | Blood | | M
53-CGEN-Blood | WBC#3 | CGEN | Blood | | M
54-CG-Spleen | CG-267 | Ichilov | Spleen | PM | F/25
55-CG-Spleen | 111P0106B | Ambion | Spleen | PM | M/25
56-CG-Spleen | A409246 | Biochain | Spleen | PM | F/12
56-CG-Thymus | CG-98-7 | Ichilov | Thymus | PM | F/28
58-Am-Thymus | 101P0101A | Ambion | Thymus | PM | M/14
59-B-Thymus | A409278 | Biochain | Thymus | PM | M/28
60-B-Thyroid | A610287 | Biochain | Thyroid | PM | M/27
61-B-Thyroid | A610286 | Biochain | Thyroid | PM | M/24
62-CG-Thyroid | CG-119-2 | Ichilov | Thyroid | PM | F/66
63-Cl-Salivary Gland | 1070319 | Clontech | Salivary Gland | PM-Pool of 24 | M&F
64-Am-Kidney | 111P0101B | Ambion | Kidney | PM-Pool of 14 | M&F
65-Cl-Kidney | 1110970 | Clontech | Kidney | PM-Pool of 14 | M&F
66-B-Kidney | A411080 | Biochain | Kidney | PM-Pool of 5 | M&F
67-CG-Cerebellum | CG-183-5 | Ichilov | Cerebellum | PM | M/74
68-CG-Cerebellum | CG-212-5 | Ichilov | Cerebellum | PM | M/54
69-B-Brain | A411322 | Biochain | Brain | PM | M/28
70-Cl-Brain | 1120022 | Clontech | Brain | PM-Pool of 2 | M&F
71-B-Brain | A411079 | Biochain | Brain | PM-Pool of 2 | M&F
72-CG-Brain | CG-151-1 | Ichilov | Brain | PM | F/86
73-Am-Skeletal Muscle | 101P013A | Ambion | Skeletal Muscle | PM | F/28
74-Cl-Skeletal Muscle | 1061038 | Clontech | Skeletal Muscle | PM-Pool of 2 | M&F
Materials and Experimental Procedures
RNA preparation - RNA was obtained from Clontech (Franklin Lakes, NJ USA 07417,
www.clontech.com), BioChain Inst. Inc. (Hayward, CA 94545 USA www.biochain.com), ABS
(Wilmington, DE 19801, USA, http://www.absbioreagents.com) or Ambion (Austin, TX 78744
USA, http://www.ambion.com). Alternatively, RNA was generated from tissue samples using
TRI-Reagent (Molecular Research Center), according to the manufacturer's instructions. Tissue and
RNA samples were obtained from patients or from postmortem. Total RNA samples were
treated with DNaseI (Ambion) and purified using RNeasy columns (Qiagen).

RT PCR - Purified RNA (1 µg) was mixed with 150 ng Random Hexamer primers
(Invitrogen) and 500 µM dNTP in a total volume of 15.6 µl. The mixture was incubated for 5
min at 65 °C and then quickly chilled on ice. Thereafter, 5 µl of 5X SuperScript II first strand
buffer (Invitrogen), 2.4 µl 0.1M DTT and 40 units RNasin (Promega) were added, and the
mixture was incubated for 10 min at 25 °C, followed by further incubation at 42 °C for 2 min.
Then, 1 µl (200 units) of SuperScript II (Invitrogen) was added and the reaction (final volume of
25 µl) was incubated for 50 min at 42 °C and then inactivated at 70 °C for 15 min. The resulting
cDNA was diluted 1:20 in TE buffer (10 mM Tris pH=8, 1 mM EDTA pH=8).

Real-Time RT-PCR analysis - cDNA (5 µl), prepared as described above, was used as a
template in Real-Time PCR reactions using the SYBR Green I assay (PE Applied Biosystem)
with specific primers and UNG Enzyme (Eurogentec or ABI or Roche). The amplification was
effected as follows: 50 °C for 2 min, 95 °C for 10 min, and then 40 cycles of 95 °C for 15 sec,
followed by 60 °C for 1 min. Detection was performed by using the PE Applied Biosystem SDS
7000. The cycle in which the reactions achieved a threshold level (Ct) of fluorescence was
registered and was used to calculate the relative transcript quantity in the RT reactions. The
relative quantity was calculated using the equation Q = efficiency^(-Ct). The efficiency of the PCR
reaction was calculated from a standard curve, created by using serial dilutions of several
reverse transcription (RT) reactions. To minimize inherent differences in the RT reaction, the
resulting relative quantities were normalized to the geometric mean of the relative quantities of
several housekeeping (HSKP) genes. A schematic summary of quantitative real-time PCR
analysis is presented in Figure 3. As shown, the x-axis shows the cycle number. Ct is the
Threshold Cycle point, which is the cycle at which the amplification curve crosses the fluorescence
threshold that was set in the experiment. This point is a calculated cycle number in which PCR
product signal is above the background level (passive dye ROX) and still in the
Geometric/Exponential phase (as shown, once the level of fluorescence crosses the
measurement threshold, it has a geometrically increasing phase, during which measurements are
most accurate, followed by a linear phase and a plateau phase; for quantitative measurements,
the latter two phases do not provide accurate measurements). The y-axis shows the normalized
reporter fluorescence. It should be noted that this type of analysis provides relative
quantification.
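The relative quantification described above (Q = efficiency^(-Ct), normalized to the geometric mean of the housekeeping genes) can be sketched as follows. This is an illustration only: the function names are hypothetical, the efficiency is taken here as the per-cycle amplification factor (2.0 meaning perfect doubling, which is an assumption about the convention), and the Ct values are invented for the example.

```python
from math import prod


def relative_quantity(efficiency, ct):
    """Q = efficiency ** (-Ct) for one gene in one RT reaction."""
    return efficiency ** (-ct)


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def normalized_quantity(target, housekeeping):
    """Normalize the target quantity to the housekeeping genes.

    target: (efficiency, Ct) pair for the gene of interest.
    housekeeping: list of (efficiency, Ct) pairs, one per HSKP gene.
    """
    q_target = relative_quantity(*target)
    q_hskp = [relative_quantity(eff, ct) for eff, ct in housekeeping]
    return q_target / geometric_mean(q_hskp)


# Hypothetical values: target Ct 25; two housekeeping genes at Ct 20 and 22.
value = normalized_quantity((2.0, 25.0), [(2.0, 20.0), (2.0, 22.0)])
print(f"normalized relative quantity: {value:.4f}")
```

Because the result is a ratio of quantities measured in the same RT reaction, it is comparable across samples only in relative terms, consistent with the note that this analysis provides relative quantification.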



The sequences of the housekeeping genes measured in all the examples in the testing panel were as
follows:

Ubiquitin (GenBank Accession No. BC000449)
Ubiquitin Forward primer (SEQ ID NO: 326): ATTTGGGTCGCGGTTCTTG
Ubiquitin Reverse primer (SEQ ID NO: 327): TGCCTTGACATTCTCGATGGT
Ubiquitin-amplicon (SEQ ID NO: 328):
ATTTGGGTCGCGGTTCTTGTTTGTGGATCGCTGTGATCGTCACTTGACAATGCAGAT
CTTCGTGAAGACTCTGACTGGTAAGACCATCACCCTCGAGG
TTGAGCCCAGTGACACCATCGAGAATGTCAAGGCA

SDHA (GenBank Accession No. NM_004168)
SDHA Forward primer (SEQ ID NO: 329): TGGGAACAAGAGGGCATCTG
SDHA Reverse primer (SEQ ID NO: 330): CCACCACTGCATCAAATTCATG
SDHA-amplicon (SEQ ID NO: 331):
TGGGAACAAGAGGGCATCTGCTAAAGTTTCAGATTCCATTTCTGCTCAGTATCCAGT
AGTGGATCATGAATTTGATGCAGTGGTGG

PBGD (GenBank Accession No. BC019323)
PBGD Forward primer (SEQ ID NO: 332): TGAGAGTGATTCGCGTGGG
PBGD Reverse primer (SEQ ID NO: 333): CCAGGGTACGAGGCTTTCAAT
PBGD-amplicon (SEQ ID NO: 334):
TGAGAGTGATTCGCGTGGGTACCCGCAAGAGCCAGCTTGCTCGCATACAGACGGAC
AGTGTGGTGGCAACATTGAAAGCCTCGTACCCTGG

HPRT1 (GenBank Accession No. NM_000194)
HPRT1 Forward primer (SEQ ID NO: 1295): TGACACTGGCAAAACAATGCA
HPRT1 Reverse primer (SEQ ID NO: 1296): GGTCCTTTTCACCAGCAAGCT
HPRT1-amplicon (SEQ ID NO: 1297):
TGACACTGGCAAAACAATGCAGACTTTGCTTTCCTTGGTCAGGCAGTATAATCCAA
AGATGGTCAAGGTCGCAAGCTTGCTGGTGAAAAGGACC
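As a consistency check on the listings above, each amplicon should begin with its forward primer and end with the reverse complement of its reverse primer. The sketch below applies that check to the SDHA entries; the function names are illustrative and not part of the described method.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank_amplicon(forward, reverse, amplicon):
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return (amplicon.startswith(forward)
            and amplicon.endswith(reverse_complement(reverse)))


# SDHA primers and amplicon as listed above (SEQ ID NOs: 329, 330, 331).
SDHA_FORWARD = "TGGGAACAAGAGGGCATCTG"
SDHA_REVERSE = "CCACCACTGCATCAAATTCATG"
SDHA_AMPLICON = (
    "TGGGAACAAGAGGGCATCTGCTAAAGTTTCAGATTCCATTTCTGCTCAGTATCCAGT"
    "AGTGGATCATGAATTTGATGCAGTGGTGG"
)

print(primers_flank_amplicon(SDHA_FORWARD, SDHA_REVERSE, SDHA_AMPLICON))
```

The same check can be applied to the other primer pairs by substituting their sequences.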

The sequences of the housekeeping genes measured in all the examples on the normal tissue
samples panel were as follows:

RPL19 (GenBank Accession No. NM_000981)
RPL19 Forward primer (SEQ ID NO: 1298): TGGCAAGAAGAAGGTCTGGTTAG
RPL19 Reverse primer (SEQ ID NO: 1420): TGATCAGCCCATCTTTGATGAG
RPL19-amplicon (SEQ ID NO: 1630):
TGGCAAGAAGAAGGTCTGGTTAGACCCCAATGAGACCAATGAAATCGCCAATGCCA
ACTCCCGTCAGCAGATCCGGAAGCTCATCAAAGATGGGCTGATCA

TATA box (GenBank Accession No. NM_003194)
TATA box Forward primer (SEQ ID NO: 1631): CGGTTTGCTGCGGTAATCAT
TATA box Reverse primer (SEQ ID NO: 1632): TTTCTTGCTGCCAGTCTGGAC
TATA box-amplicon (SEQ ID NO: 1633):
CGGTTTGCTGCGGTAATCATGAGGATAAGAGAGCCACGAACCACGGCACTGATTTT
CAGTTCTGGGAAAATGGTGTGCACAGGAGCCAAGAGTGAAGAACAGTCCAGACTG
GCAGCAAGAAA

Ubiquitin (GenBank Accession No. BC000449)
Ubiquitin Forward primer (SEQ ID NO: 326): ATTTGGGTCGCGGTTCTTG
Ubiquitin Reverse primer (SEQ ID NO: 327): TGCCTTGACATTCTCGATGGT
Ubiquitin-amplicon (SEQ ID NO: 328):
ATTTGGGTCGCGGTTCTTGTTTGTGGATCGCTGTGATCGTCACTTGACAATGCAGAT
CTTCGTGAAGACTCTGACTGGTAAGACCATCACCCTCGAGG
TTGAGCCCAGTGACACCATCGAGAATGTCAAGGCA

SDHA (GenBank Accession No. NM_004168)
SDHA Forward primer (SEQ ID NO: 329): TGGGAACAAGAGGGCATCTG
SDHA Reverse primer (SEQ ID NO: 330): CCACCACTGCATCAAATTCATG
SDHA-amplicon (SEQ ID NO: 331):
TGGGAACAAGAGGGCATCTGCTAAAGTTTCAGATTCCATTTCTGCTCAGTATCCAGT
AGTGGATCATGAATTTGATGCAGTGGTGG



Oligonucleotide-based micro-array experiment protocol

Microarray fabrication

Microarrays (chips) were printed by pin deposition using the MicroGrid II MGII 600
robot from BioRobotics Limited (Cambridge, UK). 50-mer oligonucleotide target sequences
were designed by Compugen Ltd (Tel-Aviv, IL) as described by A. Shoshan et al, "Optical
technologies and informatics", Proceedings of SPIE, Vol 4266, pp. 86-95 (2001). The designed
oligonucleotides were synthesized and purified by desalting with the Sigma-Genosys system
(The Woodlands, TX, US) and all of the oligonucleotides were joined to a C6 amino-modified
linker at the 5' end, or attached directly to CodeLink slides (Cat #25-6700-01, Amersham
Bioscience, Piscataway, NJ, US). The 50-mer oligonucleotides, forming the target sequences,
were first suspended in Ultra-pure DDW (Cat # 01-866-1A, Kibbutz Beit-Haemek, Israel) to a
concentration of 50 µM. Before printing the slides, the oligonucleotides were resuspended in
300 mM sodium phosphate (pH 8.5) to a final concentration of 150 mM and printed at 35-40%
relative humidity at 21 °C.

Each slide contained a total of 9792 features in 32 subarrays. Of these features, 4224
features were sequences of interest according to the present invention and negative controls that
were printed in duplicate. An additional 288 features (96 target sequences printed in triplicate)
contained housekeeping genes from Human Evaluation Library2, Compugen Ltd, Israel.
Another 384 features are E. coli spikes 1-6, which are oligos to E. coli genes which are
commercially available in the Array Control product (Array control - sense oligo spots, Ambion
Inc., Austin, TX, Cat #1781, Lot #112K06).

Post-coupling processing of printed slides

After the spotting of the oligonucleotides to the glass (CodeLink) slides, the slides were
incubated for 24 hours in a sealed saturated NaCl humidification chamber (relative humidity
70-75%).

Slides were treated for blocking of the residual reactive groups by incubating them in
blocking solution at 50°C for 15 minutes (10 ml/slide of buffer containing 0.1M Tris, 50mM
ethanolamine, 0.1% SDS). The slides were then rinsed twice with Ultra-pure DDW (double
distilled water). The slides were then washed with wash solution (10 ml/slide; 4X SSC, 0.1%
SDS) at 50°C for 30 minutes on the shaker. The slides were then rinsed twice with Ultra-pure
DDW, followed by drying by centrifugation for 3 minutes at 800 rpm.

Next, in order to assist in automatic operation of the hybridization protocol, the slides
were treated with Ventana Discovery hybridization station barcode adhesives. The printed
slides were loaded on a Bio-Optica (Milan, Italy) hematology staining device and were
incubated for 10 minutes in 50 ml of 3-Aminopropyl Triethoxysilane (Sigma A3648 lot
#122K589). Excess fluid was dried and slides were then incubated for three hours at 20 mm Hg
in a dark vacuum desiccator (Pelco 2251, Ted Pella, Inc., Redding, CA).

The following protocol was then followed with the Genisphere 900-RP (random primer),
with mini elute columns on the Ventana Discovery HybStation™, to perform the microarray
experiments. Briefly, the protocol was performed as described with regard to the instructions
and information provided with the device itself. The protocol included cDNA synthesis and
labeling. cDNA concentration was measured with the TBS-380 (Turner Biosystems,
Sunnyvale, CA) PicoFluor, which is used with the OliGreen ssDNA Quantitation reagent and
kit.

Hybridization was performed with the Ventana Hybridization device, according to the
provided protocols (Discovery Hybridization Station, Tucson, AZ).
The slides were then scanned with GenePix 4000B dual laser scanner from Axon
Instruments Inc, and analyzed by GenePix Pro 5.0 software.

Schematic summary of the oligonucleotide based microarray fabrication and the
experimental flow is presented in Figures 4 and 5.

Briefly, as shown in Figure 4, DNA oligonucleotides at 25 µM were deposited (printed)
onto Amersham 'CodeLink' glass slides generating a well defined 'spot'. These slides are
covered with a long-chain, hydrophilic polymer chemistry that creates an active 3-D surface that
covalently binds the DNA oligonucleotides' 5'-end via the
C6-amine modification. This binding ensures that the full length of the DNA oligonucleotides is
available for hybridization to the cDNA and also allows lower background, high sensitivity and
reproducibility.

Figure 5 shows a schematic method for performing the microarray experiments. It
should be noted that stages on the left-hand or right-hand side may optionally be performed in
any order, including in parallel, until stage 4 (hybridization). Briefly, on the left-hand side, the
target oligonucleotides are being spotted on a glass microscope slide (although optionally other
materials could be used) to form a spotted slide (stage 1). On the right-hand side, control sample
RNA and cancer sample RNA are Cy3 and Cy5 labeled, respectively (stage 2), to form labeled
probes. It should be noted that the control and cancer samples come from corresponding tissues
(for example, normal prostate tissue and cancerous prostate tissue). Furthermore, the tissue
from which the RNA was taken is indicated below in the specific examples of data for particular
clusters, with regard to overexpression of an oligonucleotide from a "chip" (microarray), as for
example "prostate" for chips in which prostate cancerous tissue and normal tissue were tested as
described above. In stage 3, the probes are mixed. In stage 4, hybridization is performed to
form a processed slide. In stage 5, the slide is washed and scanned to form an image file,
followed by data analysis in stage 6.

The following clusters were found to be overexpressed in lung cancer:

W60282_PEA_1
F05068_PEA_1
H38804_PEA_1
HSENA78
T39971
(R00299)
H14624
Z41644_PEA_1
Z25299_PEA_2
HSSTROL3
HUMTREFAC_PEA_2
HSS100PCB
HSU33147_PEA_1
HUMCA1XIA
H61775
HUMGRP5E
HUMODCA
AA161187
R66178
D56406_PEA_1
M85491_PEA_1
Z21368_PEA_1
HUMCA1XIA
R20779
R38144_PEA_2
Z44808_PEA_1
HUMOSTRO_PEA_1_PEA_1
R11723_PEA_3
AI076020
T23580
M79217_PEA_1
M62096_PEA_1
M78076_PEA_1
T99080_PEA_4
T08446_PEA_1
R16276_PEA_1

The following clusters were found to be overexpressed in lung small cell cancer:

H61775
HUMGRP5E
M85491_PEA_1
Z44808_PEA_1
AA161187
R66178
HUMPHOSLIP_PEA_2
AI076020
T23580
M79217_PEA_1
M62096_PEA_1
M78076_PEA_1
T99080_PEA_4
T08446_PEA_1

The following clusters were found to be overexpressed in lung adenocarcinoma:

R00299
M85491_PEA_1
Z21368_PEA_1
HUMCA1XIA
AA161187
R66178
T11628_PEA_1
The following clusters were found to be overexpressed in lung squamous cell:

HUMODCA
R00299
D56406_PEA_1
Z44808_PEA_1
Z21368_PEA_1
HUMCA1XIA
AA161187
R66178
HUMCEA_PEA_1
R35137_PEA_1_PEA_1_PEA_1

DESCRIPTION FOR CLUSTER H61775 
Cluster H61775 features 2 transcript(s) and 6 segment(s) of interest, the names for which
are given in Tables 4 and 5, respectively; the sequences themselves are given at the end of the
application. The selected protein variants are given in Table 6.

Table 4 - Transcripts of interest

Transcript Name | Sequence ID No.
H61775_T21 | 1
H61775_T22 | 2


Table 5 - Segments of interest

Segment Name | Sequence ID No.
H61775_node_2 | 151
H61775_node_4 | 152
H61775_node_6 | 153
H61775_node_8 | 154
H61775_node_0 | 155
H61775_node_5 | 156
Table 6 - Proteins of interest

Protein Name | Sequence ID No.
H61775_P16 | 1281
H61775_P17 | 1282



Cluster H61775 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 6 refer to weighted expression of ESTs in each
category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to the
expression of all ESTs in that category, according to parts per million).
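The "parts per million" figure can be illustrated with a plain ratio. The sketch below is an assumption-laden simplification: the weighting applied to EST counts is not specified in this passage, so unweighted counts are used, and both the function name and the example counts are hypothetical.

```python
def ests_per_million(cluster_ests, category_ests):
    """ESTs of one cluster per million ESTs of a tissue category.

    Unweighted counts are used here; the weighting scheme referred to in
    the text is not reproduced.
    """
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return cluster_ests * 1_000_000 / category_ests


# Hypothetical example: 3 cluster ESTs among 300,000 ESTs in a category.
print(ests_per_million(3, 300_000))
```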

Overall, the following results were obtained as shown with regard to the histograms in
Figure 6 and Table 7. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: brain malignant tumors and a mixture of malignant tumors
from different tissues.



Table 7 - Normal tissue distribution

Name of Tissue | Number
bladder | 0
brain | 0
colon | 0
epithelial | 10
general | 3
breast | 8
muscle | 0
ovary | 0
pancreas | 0
prostate | 0
uterus | 0



Table 8 - P values and ratios for expression in cancerous tissue

Name of Tissue | P1 | P2 | SP1 | R3 | SP2 | R4
bladder | 3.1e-01 | 3.8e-01 | 3.2e-01 | 2.5 | 4.6e-01 | 1.9
brain | 8.8e-02 | 6.5e-02 | 1 | 3.5 | 4.1e-04 | 5.8
colon | 5.6e-01 | 6.4e-01 | 1 | 1.1 | 1 | 1.1
epithelial | 3.0e-02 | 1.3e-01 | 2.3e-02 | 2.1 | 3.2e-01 | 1.2
general | 1.3e-06 | 4.9e-05 | 1.0e-07 | 6.3 | 1.5e-06 | 4.3
breast | 4.7e-01 | 3.7e-01 | 3.3e-01 | 2.0 | 4.6e-01 | 1.6
muscle | 2.3e-01 | 2.9e-01 | 1.5e-01 | 6.8 | 3.9e-01 | 2.6
ovary | 3.8e-01 | 4.2e-01 | 1.5e-01 | 2.4 | 2.6e-01 | 1.9
pancreas | 3.3e-01 | 4.4e-01 | 4.2e-01 | 2.4 | 5.3e-01 | 1.9
prostate | 7.3e-01 | 7.8e-01 | 6.7e-01 | 1.5 | 7.5e-01 | 1.3
uterus | 1.0e-01 | 2.6e-01 | 2.9e-01 | 2.6 | 5.1e-01 | 1.8



As noted above, contig H61775 features 2 transcript(s), which were listed in Table 4
above. A description of each variant protein according to the present invention is now provided.

Variant protein H61775_P16 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) H61775_T21. One
or more alignments to one or more previously published protein sequences are given at the end
of the application. A brief description of the relationship of the variant protein according to the
present invention to each such aligned protein is as follows:

Comparison report between H61775_P16 and Q9P2J2 (SEQ ID NO:1694):
1. An isolated chimeric polypeptide encoding for H61775_P16, comprising a first amino
acid sequence being at least 90 % homologous to
MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 11 - 93 of Q9P2J2, which
also corresponds to amino acids 1 - 83 of H61775_P16, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and
most preferably at least 95% homologous to a polypeptide having the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV corresponding to amino acids 84 - 152 of H61775_P16, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of H61775_P16, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV in H61775_P16.

Comparison report between H61775_P16 and AAQ88495 (SEQ ID NO:1695):
1. An isolated chimeric polypeptide encoding for H61775_P16, comprising a first amino
acid sequence being at least 90 % homologous to
MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 1 - 83 of AAQ88495, which
also corresponds to amino acids 1 - 83 of H61775_P16, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and
most preferably at least 95% homologous to a polypeptide having the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV corresponding to amino acids 84 - 152 of H61775_P16, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of H61775_P16, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW
RSSCSVTLQV in H61775_P16.
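The segment coordinates in the comparison reports can be checked by joining the two sequences and reading off their 1-based ranges. The sketch below also includes a simple ungapped percent-identity function as a rough stand-in for the "homologous" thresholds; that stand-in is an assumption, since homology in practice is usually assessed with alignment software whose parameters are not given in this passage. All function names are illustrative.

```python
# First (shared) and second (tail) sequences as given in the reports above.
HEAD = (
    "MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL"
    "RFGFLLPIFIQFGLYSPRIDPDYVG"
)
TAIL = (
    "DCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCW"
    "RSSCSVTLQV"
)


def assemble(head, tail):
    """Join two segments contiguously, in sequential order.

    Returns the full sequence and the 1-based inclusive range of each part.
    """
    full = head + tail
    head_range = (1, len(head))
    tail_range = (len(head) + 1, len(full))
    return full, head_range, tail_range


def percent_identity(a, b):
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


full, head_range, tail_range = assemble(HEAD, TAIL)
print(head_range, tail_range, len(full))
```

The resulting ranges agree with the stated correspondence of amino acids 1 - 83 and 84 - 152 of H61775_P16.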

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein H61775_P16 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 9 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein H61775_P16 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 9 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
14 | I->T | No
138 | G->R | No
34 | G->E | Yes
48 | G->R | No
91 | R->* | Yes



Variant protein H61775_P16 is encoded by the following transcript(s): H61775_T21, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
H61775_T21 is shown in bold; this coding portion starts at position 261 and ends at position
716. The transcript also has the following SNPs as listed in Table 10 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
H61775_P16 sequence provides support for the deduced sequence of this variant protein
according to the present invention).
Table 10 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
117                                    T -> C                      Yes
200                                    T -> C                      No
672                                    G -> C                      No
222                                    T -> C                      Yes
301                                    T -> C                      No
361                                    G -> A                      Yes
377                                    G -> A                      No
400                                    -> C                        No
402                                    G -> C                      No
531                                    C -> T                      Yes
566                                    T -> C                      No
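The nucleotide positions in Table 10 and the amino acid positions in Table 9 can be related through the stated start of the coding portion (position 261 of H61775_T21). The following is a minimal sketch of that arithmetic, assuming 1-based transcript coordinates and an uninterrupted reading frame; the function name `codon_position` is illustrative and is not part of the application.

```python
def codon_position(nt_pos, cds_start):
    """Map a 1-based transcript position to (amino-acid index, position in codon).

    Both returned values are 1-based. Assumes the reading frame starts at
    cds_start and is not interrupted (an assumption of this sketch).
    """
    if nt_pos < cds_start:
        raise ValueError("position lies upstream of the coding portion")
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START = 261  # start of the coding portion of H61775_T21, as stated in the text

# Nucleotide SNP positions taken from Table 10
for nt in (301, 361, 402, 531, 672):
    aa_index, frame = codon_position(nt, CDS_START)
    print(f"nucleotide {nt} -> amino acid {aa_index} (codon position {frame})")
```

Under these assumptions, nucleotide positions 301, 361, 402, 531 and 672 fall in codons 14, 34, 48, 91 and 138, which are the amino acid positions listed in Table 9.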



Variant protein H61775_P17 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) H61775_T22. One
or more alignments to one or more previously published protein sequences are given at the end
of the application. A brief description of the relationship of the variant protein according to the
present invention to each such aligned protein is as follows:

Comparison report between H61775_P17 and Q9P2J2:
1. An isolated chimeric polypeptide encoding for H61775_P17, comprising a first amino
acid sequence being at least 90% homologous to

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 11 - 93 of Q9P2J2, which
also corresponds to amino acids 1 - 83 of H61775_P17.

Comparison report between H61775_P17 and AAQ88495:
1. An isolated chimeric polypeptide encoding for H61775_P17, comprising a first amino
acid sequence being at least 90% homologous to

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWL
RFGFLLPIFIQFGLYSPRIDPDYVG corresponding to amino acids 1 - 83 of AAQ88495, which
also corresponds to amino acids 1 - 83 of H61775_P17.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein H61775_P17 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 11 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein H61775_P17 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 11 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
14                                        I -> T                       No
34                                        G -> E                       Yes
48                                        G -> R                       No



Variant protein H61775_P17 is encoded by the following transcript(s): H61775_T22, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
H61775_T22 is shown in bold; this coding portion starts at position 261 and ends at position
509. The transcript also has the following SNPs as listed in Table 12 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
H61775_P17 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

WO 2006/131783    PCT/IB2005/004037

Table 12 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
117                                    T -> C                      Yes
200                                    T -> C                      No
222                                    T -> C                      Yes
301                                    T -> C                      No
361                                    G -> A                      Yes
377                                    G -> A                      No
400                                    -> C                        No
402                                    G -> C                      No
596                                    T -> A                      Yes


As noted above, cluster H61775 features 6 segment(s), which were listed in Table 4 above
and for which the sequence(s) are given at the end of the application. These segment(s) are
portions of nucleic acid sequence(s) which are described herein separately because they are of
particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster H61775_node_2 according to the present invention is supported by 17
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H61775_T21 and H61775_T22. Table 13 below describes
the starting and ending position of this segment on each transcript.

Table 13 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T21         87                           318
H61775_T22         87                           318



Segment cluster H61775_node_4 according to the present invention is supported by 20
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H61775_T21 and H61775_T22. Table 14 below describes
the starting and ending position of this segment on each transcript.

Table 14 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T21         319                          507
H61775_T22         319                          507

Segment cluster H61775_node_6 according to the present invention is supported by 1
library. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H61775_T22. Table 15 below describes the starting and
ending position of this segment on each transcript.



Table 15 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T22         515                          715



Segment cluster H61775_node_8 according to the present invention is supported by 5
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H61775_T21. Table 16 below describes the starting and
ending position of this segment on each transcript.

Table 16 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T21         508                          1205



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster H61775_node_0 according to the present invention is supported by 4
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H61775_T21 and H61775_T22. Table 17 below describes
the starting and ending position of this segment on each transcript.

Table 17 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T21         1                            86
H61775_T22         1                            86



Segment cluster H61775_node_5 according to the present invention can be found in the
following transcript(s): H61775_T22. Table 18 below describes the starting and ending position
of this segment on each transcript.

Table 18 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H61775_T22         508                          514
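The segment coordinates in Tables 13 to 18 can be checked for internal consistency: on each transcript, the listed segments should follow one another without gaps or overlaps. The sketch below encodes the coordinates from those tables and tests this; the data layout and the helper name `is_contiguous` are illustrative assumptions, not part of the application.

```python
# Segment coordinates copied from Tables 13 to 18 (name, start, end), 1-based inclusive
SEGMENTS = {
    "H61775_T21": [
        ("node_0", 1, 86),
        ("node_2", 87, 318),
        ("node_4", 319, 507),
        ("node_8", 508, 1205),
    ],
    "H61775_T22": [
        ("node_0", 1, 86),
        ("node_2", 87, 318),
        ("node_4", 319, 507),
        ("node_5", 508, 514),
        ("node_6", 515, 715),
    ],
}


def is_contiguous(segments):
    """Return True if the segments start at position 1 and tile the transcript
    with no gap and no overlap between consecutive segments."""
    ordered = sorted(segments, key=lambda seg: seg[1])
    if ordered[0][1] != 1:
        return False
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(ordered, ordered[1:]))


for transcript, segs in SEGMENTS.items():
    print(transcript, "contiguous:", is_contiguous(segs))
```

With the tabulated values, both transcripts share nodes 0, 2 and 4 (positions 1 to 507) and diverge afterwards, which agrees with the two variant proteins sharing their first 83 amino acids.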



Microarray (chip) data is also available for this gene as follows. As described above with
regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (with regard to lung cancer), shown in Table 19.

Table 19 - Oligonucleotides related to this gene

Oligonucleotide name    Overexpressed in cancers    Chip reference
H61775_0_11_0           Lung cancer                 Lung

Variant protein alignment to the previously known protein:
Sequence name: /tmp/PswORJLCti/aLAXQjXh07:Q9P2J2

Sequence documentation:

Alignment of: H61775_P16 x Q9P2J2

Alignment segment 1/1:

Quality: 803.00
Escore: 0
Matching length: 83
Total length: 83
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 11 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 60

 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83
    |||||||||||||||||||||||||||||||||
 61 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 93

Sequence name: /tmp/PswORJLCti/aLAXQjXh07:AAQ88495

Sequence documentation:

Alignment of: H61775_P16 x AAQ88495

Alignment segment 1/1:

Quality: 803.00
Escore: 0
Matching length: 83
Total length: 83
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50

 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83
    |||||||||||||||||||||||||||||||||
 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83



Sequence name: /tmp/naab8yR3GC/pSM412IL5o:Q9P2J2

Sequence documentation:

Alignment of: H61775_P17 x Q9P2J2

Alignment segment 1/1:

Quality: 803.00
Escore: 0
Matching length: 83
Total length: 83
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 11 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 60

 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83
    |||||||||||||||||||||||||||||||||
 61 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 93



Sequence name: /tmp/naab8yR3GC/pSM412IL5o:AAQ88495

Sequence documentation:

Alignment of: H61775_P17 x AAQ88495

Alignment segment 1/1:

Quality: 803.00
Escore: 0
Matching length: 83
Total length: 83
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 50

 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83
    |||||||||||||||||||||||||||||||||
 51 PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG 83
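The "Percent Identity" figures reported in the alignments above can be illustrated with a short calculation over two aligned sequences of equal length. This is only a sketch of the general idea, assuming a gap character "-" and simple position-by-position comparison; it is not the scoring program used to generate the alignments, and the function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue in both sequences.

    Columns containing a gap ('-') never count as a match. Both inputs must be
    the already-aligned strings, so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Residues 1-83 shared by the variant proteins, as shown in the alignments
SHARED_REGION = (
    "MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP"
    "PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVG"
)

print(len(SHARED_REGION), percent_identity(SHARED_REGION, SHARED_REGION))
```

The same function can be applied to any candidate sequence against a reference to see whether it would meet a threshold such as the "at least 90% homologous" language used in the comparison reports, bearing in mind that homology as used in the application may be defined more broadly than strict identity.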



Expression of immunoglobulin superfamily, member 9,
H61775 transcripts which are detectable by amplicon as depicted in sequence name H61775seg8
in normal and cancerous lung tissues

Expression of immunoglobulin superfamily, member 9 transcripts detectable by or
according to seg8, H61775seg8 amplicon (SEQ ID NO: 1636) and H61775seg8F2 (SEQ ID NO:
1634) and H61775seg8R2 (SEQ ID NO: 1635) primers was measured by real time PCR. In
parallel the expression of four housekeeping genes - PBGD (GenBank Accession No.
BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334; primers SEQ ID NOs 332 and 333),
HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID
NO:1297; primers SEQ ID NOs 1295 and 1296), Ubiquitin (GenBank Accession No.
BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328; primers SEQ ID NOs 326 and
327) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID
NO:331; primers SEQ ID NOs 329 and 330) was measured similarly. For each RT sample, the
expression of the above amplicon was normalized to the geometric mean of the quantities of the
housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2, "Tissue samples in testing panel"), to obtain a value of fold up-regulation for
each sample relative to median of the normal PM samples.
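The normalization described above (division by the geometric mean of the housekeeping-gene quantities, then by the median of the normal samples) can be sketched as follows. The sample identifiers and quantities are invented purely for illustration, and the function name `fold_up_regulation` is an assumption of this sketch rather than anything defined in the application.

```python
from statistics import geometric_mean, median


def fold_up_regulation(target, housekeeping, normal_ids):
    """Compute fold up-regulation per sample.

    target:       {sample_id: quantity of the amplicon of interest}
    housekeeping: {gene_name: {sample_id: quantity}}
    normal_ids:   sample ids used as the normal reference group
    """
    # Step 1: normalize each sample to the geometric mean of its housekeeping genes
    normalized = {
        sample: qty / geometric_mean([genes[sample] for genes in housekeeping.values()])
        for sample, qty in target.items()
    }
    # Step 2: divide by the median normalized quantity of the normal samples
    baseline = median(normalized[s] for s in normal_ids)
    return {sample: value / baseline for sample, value in normalized.items()}


# Hypothetical quantities (not data from the application)
target = {"N1": 2.0, "N2": 4.0, "T1": 40.0}
housekeeping = {
    "PBGD": {"N1": 1.0, "N2": 1.0, "T1": 1.0},
    "SDHA": {"N1": 4.0, "N2": 4.0, "T1": 4.0},
}
folds = fold_up_regulation(target, housekeeping, normal_ids=["N1", "N2"])
print(folds)
```

A sample would then be counted as over-expressing at the 5 fold threshold when its value in `folds` is at least 5.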

Figure 7 is a histogram showing overexpression of the above-indicated immunoglobulin
superfamily, member 9 transcripts in cancerous lung samples relative to the normal samples.
The number and percentage of samples that exhibit at least 5 fold over-expression, out of the
total number of samples tested, is indicated at the bottom.

As is evident from Figure 7, the expression of immunoglobulin superfamily, member 9
transcripts detectable by the above amplicon(s) in cancer samples was significantly higher than
in the non-cancerous samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in
testing panel"). Notably an over-expression of at least 5 fold was found in 11 out of 15
adenocarcinoma samples, 12 out of 16 squamous cell carcinoma samples, 1 out of 4 large cell
carcinoma samples and in 8 out of 8 small cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of immunoglobulin superfamily,
member 9 transcripts detectable by the above amplicon in lung cancer samples versus the
normal tissue samples was determined by T test as 6.5E-02. In adenocarcinoma, the minimum
values were 7.62E-03 in squamous cell adenocarcinoma cancer and 1.5E-03 in small cell
carcinoma.

A threshold of 5 fold overexpression was found to differentiate between cancer and
normal samples with P value of 9.62E-04 in adenocarcinoma and 5.9E-04 in squamous cell
carcinoma, and a threshold of 10 fold overexpression was found to differentiate between small
cell adenocarcinoma cancer and normal samples with P value of 7.14E-05, as checked by
Fisher's exact test. The above values demonstrate statistical significance of the results.
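For readers unfamiliar with the exact Fisher test mentioned above, the following sketch shows how a one-sided P value is obtained from a 2x2 table of samples above and below an overexpression threshold. The application does not state how many normal samples exceeded the threshold, nor whether a one-sided or two-sided test was used, so the counts for the normal group in the usage line are hypothetical and the result is not expected to reproduce the quoted P values.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null, of observing a
    count of at least `a` in the top-left cell with all margins held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / denominator


# 11 of 15 adenocarcinoma samples were above the 5 fold threshold (from the text).
# The normal-group split (0 above, 12 below) is an assumption for illustration only.
p_value = fisher_exact_greater(11, 4, 0, 12)
print(f"one-sided P = {p_value:.3g}")
```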

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: H61775seg8F2 forward primer; and
H61775seg8R2 reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: H61775seg8.

H61775seg8F2 (SEQ ID NO: 1634)
GAAGGCTCTTGTCACTTACTAGCCAT

H61775seg8R2 (SEQ ID NO: 1635)
TGTCACCATATTTAATCCTCCCAA

H61775seg8 (SEQ ID NO: 1636)
GAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATTC
CTTCAGCTACAATGGAATTCTTGGGAGGATTAAATATGGTGACA

Expression of immunoglobulin superfamily, member 9,
H61775 transcripts which are detectable by amplicon as depicted in sequence name H61775seg8
in different normal tissues

Expression of immunoglobulin superfamily, member 9 transcripts detectable by or
according to H61775seg8 amplicon (SEQ ID NO: 1636) and H61775seg8F2 (SEQ ID NO:
1634) and H61775seg8R2 (SEQ ID NO: 1635) was measured by real time PCR. In parallel the
expression of four housekeeping genes - RPL19 (GenBank Accession No. NM_000981; RPL19
amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No. NM_003194; TATA
amplicon, SEQ ID NO: 1633), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the ovary samples (Sample Nos. 18-20, Table 4, "Tissue sample in
normal panel", above), to obtain a value of relative expression of each sample relative to median
of the ovary samples.

H61775seg8F2 (SEQ ID NO: 1634)
GAAGGCTCTTGTCACTTACTAGCCAT

H61775seg8R2 (SEQ ID NO: 1635)
TGTCACCATATTTAATCCTCCCAA

H61775seg8 (SEQ ID NO: 1636)
GAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATTC
CTTCAGCTACAATGGAATTCTTGGGAGGATTAAATATGGTGACA

The results are demonstrated in Figure 8, showing expression of immunoglobulin superfamily,
member 9, H61775 transcripts, which are detectable by amplicon as depicted in sequence name
H61775seg8, in different normal tissues.
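The relationship between the listed primers and the H61775seg8 amplicon can be checked directly: the forward primer should match the start of the amplicon, and the reverse complement of the reverse primer should match its end. The sketch below performs that check on the sequences quoted above; the helper names are illustrative and not part of the application.

```python
# Translation table for complementing DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with the forward primer and ends with the
    reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


FORWARD = "GAAGGCTCTTGTCACTTACTAGCCAT"   # H61775seg8F2 (SEQ ID NO: 1634)
REVERSE = "TGTCACCATATTTAATCCTCCCAA"     # H61775seg8R2 (SEQ ID NO: 1635)
AMPLICON = (                             # H61775seg8 (SEQ ID NO: 1636)
    "GAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATTC"
    "CTTCAGCTACAATGGAATTCTTGGGAGGATTAAATATGGTGACA"
)

print("primers flank amplicon:", primers_flank(AMPLICON, FORWARD, REVERSE))
```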

DESCRIPTION FOR CLUSTER M85491
Cluster M85491 features 2 transcript(s) and 11 segment(s) of interest, the names for
which are given in Tables 20 and 21, respectively; the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 22.

Table 20 - Transcripts of interest

Transcript Name      Sequence ID No.
M85491_PEA_1_T16     3
M85491_PEA_1_T20     4


Table 21 - Segments of interest

Segment Name             Sequence ID No.
M85491_PEA_1_node_0      157
M85491_PEA_1_node_13     158
M85491_PEA_1_node_21     159
M85491_PEA_1_node_23     160
M85491_PEA_1_node_24     161
M85491_PEA_1_node_8      162
M85491_PEA_1_node_9      163
M85491_PEA_1_node_10     164
M85491_PEA_1_node_18     165
M85491_PEA_1_node_19     166
M85491_PEA_1_node_6      167



Table 22 - Proteins of interest

Protein Name         Sequence ID No.
M85491_PEA_1_P13     1283
M85491_PEA_1_P14     1284



These sequences are variants of the known protein Ephrin type-B receptor 2 [precursor]
(SwissProt accession identifier EPB2_HUMAN; known also according to the synonyms EC
2.7.1.112; Tyrosine-protein kinase receptor EPH-3; DRT; Receptor protein-tyrosine kinase
HEK5; ERK), SEQ ID NO: 1417, referred to herein as the previously known protein.

Protein Ephrin type-B receptor 2 [precursor] is known or believed to have the following
function(s): Receptor for members of the ephrin-B family. The sequence for protein Ephrin
type-B receptor 2 [precursor] is given at the end of the application, as "Ephrin type-B receptor 2
[precursor] amino acid sequence" (SEQ ID NO:1417). Known polymorphisms for this sequence
are as shown in Table 23.

Table 23 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
671                                       A -> R. /FTId=VAR_004162.
1 - 20                                    MALRRLGAALLLLPLLAAVE -> MWVPVLALPVCTYA
923                                       E -> K
956                                       L -> V
958                                       V -> L
154                                       G -> D
476                                       K -> KQ
495 - 496                                 Missing
532                                       E -> D
568                                       R -> RR
589                                       M -> I
788                                       I -> F
[illegible]                               S -> A

Protein Ephrin type-B receptor 2 [precursor] localization is believed to be Type I
membrane protein.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: protein amino acid phosphorylation; transmembrane receptor protein
tyrosine kinase signaling pathway; neurogenesis, which are annotation(s) related to Biological
Process; protein tyrosine kinase; receptor; transmembrane-ephrin receptor; ATP binding;
transferase, which are annotation(s) related to Molecular Function; and integral membrane
protein, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.



Cluster M85491 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 9 refer to weighted expression of ESTs in each
category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to the
expression of all ESTs in that category, according to parts per million).
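The "parts per million" figure described above is, at its core, a ratio scaled by one million. The sketch below shows only that unweighted ratio; the weighting scheme used in the application is not described in this passage and is therefore not modelled, and the EST counts in the usage line are hypothetical.

```python
def parts_per_million(cluster_ests, total_ests):
    """Unweighted ratio of cluster ESTs to all ESTs in a category, per million."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    if cluster_ests < 0 or cluster_ests > total_ests:
        raise ValueError("cluster_ests must lie between 0 and total_ests")
    return 1_000_000 * cluster_ests / total_ests


# Hypothetical counts: 3 cluster ESTs among 100,000 ESTs from one tissue category
print(parts_per_million(3, 100_000))
```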

Overall, the following results were obtained as shown with regard to the histograms in
Figure 9 and Table 24. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors and a mixture of malignant
tumors from different tissues.

Table 24 - Normal tissue distribution

Name of Tissue    Number
Bladder           0
Bone              0
Brain             10
Colon             31
epithelial        10
General           12
Kidney            0
Liver             0
Lung              5
Breast            8
Muscle            5
Ovary             36
pancreas          10
Skin              0
Stomach           0



Table 25 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
Bladder           5.4e-01    6.0e-01    3.2e-01    2.5    4.6e-01    1.9
Bone              1          2.8e-01    1          1.0    7.0e-01    1.8
Brain             3.4e-01    3.6e-01    1.2e-01    2.9    1.8e-02    2.7
Colon             3.4e-02    5.7e-02    8.2e-02    2.8    2.0e-01    2.1
epithelial        1.7e-03    3.5e-03    2.0e-03    2.8    1.1e-02    2.2
General           4.8e-04    5.2e-04    6.7e-04    2.3    1.3e-03    1.9
Kidney            4.3e-01    3.7e-01    1          1.1    7.0e-01    1.5
Liver             1          4.5e-01    1          1.0    6.9e-01    1.5
Lung              2.2e-01    2.7e-01    6.9e-02    3.6    3.4e-02    3.6
Breast            8.2e-01    7.3e-01    6.9e-01    1.2    6.8e-01    1.2
Muscle            9.2e-01    4.8e-01    1          0.8    1.5e-01    3.2
Ovary             8.5e-01    7.3e-01    9.0e-01    0.7    6.7e-01    1.0
pancreas          5.5e-01    2.0e-01    6.7e-01    1.2    3.5e-01    1.8
Skin              2.9e-01    4.7e-01    1.4e-01    7.0    6.4e-01    1.6
Stomach           1.5e-01    3.2e-01    1          1.0    8.0e-01    1.3



As noted above, cluster M85491 features 2 transcript(s), which were listed in Table 20 above.
These transcript(s) encode for protein(s) which are variant(s) of protein Ephrin type-B receptor
2 [precursor]. A description of each variant protein according to the present invention is now
provided.

Variant protein M85491_PEA_1_P13 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M85491_PEA_1_T16. An alignment is given to the known protein (Ephrin type-B receptor 2
[precursor]) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:



Comparison report between M85491_PEA_1_P13 and EPB2_HUMAN:
1. An isolated chimeric polypeptide encoding for M85491_PEA_1_P13, comprising a first
amino acid sequence being at least 90% homologous to

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIR
TYQVC1STVTESSQNNWLRTKFIREJR.GAH
EADFDSATKTFPNWMENPWKVDTIAADESFSQVDLGGRVMKINTEVRSFGPVSRSGF
YLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAEEVD
VPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPIN
SRTTSEGATNCVCRNGYYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDSG
GREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQLGLTEPRIYISDLLAHTQYTFEIQAV
NGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVDSITLSWSQPDQPNGVILDYEL
QYYEK corresponding to amino acids 1 - 476 of EPB2_HUMAN, which also corresponds to
amino acids 1 - 476 of M85491_PEA_1_P13, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VPIGWVLSPSPTSLRAPLPG corresponding to amino acids 477 - 496 of
M85491_PEA_1_P13, wherein said first and second amino acid sequences are contiguous and
in a sequential order.

2. An isolated polypeptide encoding for a tail of M85491_PEA_1_P13, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VPIGWVLSPSPTSLRAPLPG in M85491_PEA_1_P13.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein M85491_PEA_1_P13 is encoded by the following transcript(s):
M85491_PEA_1_T16, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M85491_PEA_1_T16 is shown in bold; this coding portion starts at
position 143 and ends at position 1630. The transcript also has the following SNPs as listed in
Table 26 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M85491_PEA_1_P13 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 26 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
799                                    G -> A                      Yes
1066                                   C -> T                      Yes
1519                                   A -> G                      Yes
1872                                   C -> T                      Yes
2044                                   T -> C                      Yes
2156                                   G -> A                      Yes
2606                                   C -> A                      Yes
2637                                   G -> C                      Yes



Variant protein M85491_PEA_1_P14 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M85491_PEA_1_T20. An alignment is given to the known protein (Ephrin type-B receptor 2
[precursor]) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M85491_PEA_1_P14 and EPB2_HUMAN:
1. An isolated chimeric polypeptide encoding for M85491_PEA_1_P14, comprising a first
amino acid sequence being at least 90% homologous to

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIR
TYQVCNWESSQlSnSTWLRTKFIRRRGAHRIHVE
EADFDSATKTFPNWMENPWKVDTIAADESFSQVDLGGRVMKINTEVRSFGPVSRSGF
YLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAEEVD
VPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCR corresponding to amino acids 1 -
270 of EPB2_HUMAN, which also corresponds to amino acids 1 - 270 of
M85491_PEA_1_P14, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence
ERQDLTMLSRLVLNSWPQMILPPQPPKVLEL corresponding to amino acids 271 - 301 of
M85491_PEA_1_P14, wherein said first and second amino acid sequences are contiguous and
in a sequential order.

2. An isolated polypeptide encoding for a tail of M85491_PEA_1_P14, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ERQDLTMLSRLVLNSWPQMILPPQPPKVLEL in M85491_PEA_1_P14.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein M85491_PEA_1_P14 is encoded by the following transcript(s):
M85491_PEA_1_T20, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M85491_PEA_1_T20 is shown in bold; this coding portion starts at
position 143 and ends at position 1045. The transcript also has the following SNPs as listed in
Table 27 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M85491_PEA_1_P14 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 27 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
799                                    G -> A                      Yes
1135                                   T -> C                      Yes
1160                                   T -> C                      Yes
1172                                   A -> C                      Yes
1176                                   T -> A                      Yes
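The coding-portion coordinates quoted for each transcript can be cross-checked against the protein lengths implied by the comparison reports. The sketch below computes the number of codons spanned by a coding portion, assuming 1-based inclusive coordinates; whether the stated end position includes a stop codon is not specified in the text, so the comparison with protein length is indicative only. The function name `coding_length` is illustrative.

```python
def coding_length(start, end):
    """Return (nucleotides, codons) for a coding portion with 1-based inclusive
    coordinates. Raises ValueError if the span is not a whole number of codons."""
    nucleotides = end - start + 1
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("coding portion is not a positive multiple of 3")
    return nucleotides, nucleotides // 3


# Coding-portion coordinates as stated in the text for each transcript
CODING_PORTIONS = {
    "H61775_T21": (261, 716),
    "H61775_T22": (261, 509),
    "M85491_PEA_1_T16": (143, 1630),
    "M85491_PEA_1_T20": (143, 1045),
}

for transcript, (start, end) in CODING_PORTIONS.items():
    nt, codons = coding_length(start, end)
    print(f"{transcript}: {nt} nt, {codons} codons")
```

The resulting codon counts (152, 83, 496 and 301) equal the last amino acid positions cited for H61775_P16, H61775_P17, M85491_PEA_1_P13 and M85491_PEA_1_P14 respectively.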



As noted above, cluster M85491 features 11 segment(s), which were listed in Table 21
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster M85491_PEA_1_node_0 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M85491_PEA_1_T16 and M85491_PEA_1_T20.
Table 28 below describes the starting and ending position of this segment on each transcript.

Table 28 - Segment location on transcripts 



JTranscript name < M 


SeginSM Starting pBsi|icm ' ; f 


j Segmeifi gliding position ; 


M85491_PEA_1_T16 


1 


203 


M85491_PEA_1_T20 


1 


203 



10 

Segment cluster M85491_PEA_1_node_13 according to the present invention is
supported by 6 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T20. Table 29 below
describes the starting and ending position of this segment on each transcript.

Table 29 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T20      954                          1182



Segment cluster M85491_PEA_1_node_21 according to the present invention is
supported by 18 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16. Table 30 below
describes the starting and ending position of this segment on each transcript.

Table 30 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      1110                         1445

Segment cluster M85491_PEA_1_node_23 according to the present invention is
supported by 18 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16. Table 31 below
describes the starting and ending position of this segment on each transcript.

Table 31 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      1446                         1570



Segment cluster M85491_PEA_1_node_24 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16. Table 32 below
describes the starting and ending position of this segment on each transcript.

Table 32 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      1571                         2875

Segment cluster M85491_PEA_1_node_8 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M85491_PEA_1_T16 and M85491_PEA_1_T20.
Table 33 below describes the starting and ending position of this segment on each transcript.

Table 33 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      269                          672
M85491_PEA_1_T20      269                          672

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 34.

Table 34 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
M85491_0_14_0           lung malignant tumors       LUN



Segment cluster M85491_PEA_1_node_9 according to the present invention is supported
by 20 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M85491_PEA_1_T16 and M85491_PEA_1_T20.
Table 35 below describes the starting and ending position of this segment on each transcript.

Table 35 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      673                          856
M85491_PEA_1_T20      673                          856



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster M85491_PEA_1_node_10 according to the present invention is
supported by 17 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16 and
M85491_PEA_1_T20. Table 36 below describes the starting and ending position of this
segment on each transcript.

Table 36 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      857                          953
M85491_PEA_1_T20      857                          953



Segment cluster M85491_PEA_1_node_18 according to the present invention is
supported by 15 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16. Table 37 below
describes the starting and ending position of this segment on each transcript.

Table 37 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      954                          1044



Segment cluster M85491_PEA_1_node_19 according to the present invention is
supported by 15 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M85491_PEA_1_T16. Table 38 below
describes the starting and ending position of this segment on each transcript.

Table 38 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      1045                         1109



Segment cluster M85491_PEA_1_node_6 according to the present invention is supported
by 11 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M85491_PEA_1_T16 and M85491_PEA_1_T20.
Table 39 below describes the starting and ending position of this segment on each transcript.

Table 39 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M85491_PEA_1_T16      204                          268
M85491_PEA_1_T20      204                          268
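
The segment coordinates reported in Tables 28 to 39 can be cross-checked for contiguity on each transcript. The following is a minimal sketch only: the coordinates are copied from those tables, while the names SEGMENTS and is_contiguous are hypothetical and introduced purely for illustration.

```python
# Segment (start, end) positions per transcript, copied from Tables 28-39.
SEGMENTS = {
    "M85491_PEA_1_T16": [
        (1, 203), (204, 268), (269, 672), (673, 856), (857, 953),
        (954, 1044), (1045, 1109), (1110, 1445), (1446, 1570), (1571, 2875),
    ],
    "M85491_PEA_1_T20": [
        (1, 203), (204, 268), (269, 672), (673, 856), (857, 953), (954, 1182),
    ],
}


def is_contiguous(spans):
    """Return True if the spans start at position 1 and abut with no gap or overlap."""
    ordered = sorted(spans)
    if ordered[0][0] != 1:
        return False
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(ordered, ordered[1:]))


for name, spans in SEGMENTS.items():
    print(name, is_contiguous(spans))
```

Under these tabulated values, the two transcripts share their first five segments and diverge at position 954.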
Variant protein alignment to the previously known protein:

Sequence name: /tmp/qfmsU9VtxS/DylcLC9 j 8v : EPB2_HUMAN

Sequence documentation:

Alignment of: M85491_PEA_1_P13 x EPB2_HUMAN

Alignment segment 1/1:

Quality: 4726.00
Escore: 0
Matching length: 476
Total length: 476
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYD 50

 51 ENMNTIRTYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ENMNTIRTYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSI 100

101 PSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQV 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQV 150

151 DLGGRVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 DLGGRVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRI 200

201 IQNGAIFQETLSGAESTSLVAARGSCIANAEEVDVPIKLYCNGDGEWLVP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IQNGAIFQETLSGAESTSLVAARGSCIANAEEVDVPIKLYCNGDGEWLVP 250

251 IGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPINSRTTSEGA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 IGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPINSRTTSEGA 300

301 TNCVCRNGYYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDS 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 TNCVCRNGYYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDS 350

351 GGREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQLGLTEPRIYISDLLA 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 GGREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQLGLTEPRIYISDLLA 400

401 HTQYTFEIQAVNGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVD 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 HTQYTFEIQAVNGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVD 450

451 SITLSWSQPDQPNGVILDYELQYYEK 476
    ||||||||||||||||||||||||||
451 SITLSWSQPDQPNGVILDYELQYYEK 476

Sequence name: /tmp/rmnzuDbot 6/GiHbj eU8iR : EPB2_HUMAN

Sequence documentation:

Alignment of: M85491_PEA_1_P14 x EPB2_HUMAN

Alignment segment 1/1:

Quality: 2673.00
Escore: 0
Matching length: 270
Total length: 270
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYD 50

 51 ENMNTIRTYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ENMNTIRTYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSI 100

101 PSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQV 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQV 150

151 DLGGRVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 DLGGRVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRI 200

201 IQNGAIFQETLSGAESTSLVAARGSCIANAEEVDVPIKLYCNGDGEWLVP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IQNGAIFQETLSGAESTSLVAARGSCIANAEEVDVPIKLYCNGDGEWLVP 250

251 IGRCMCKAGFEAVENGTVCR 270
    ||||||||||||||||||||
251 IGRCMCKAGFEAVENGTVCR 270

Expression of Ephrin type-B receptor 2 precursor (EC 2.7.1.112) (Tyrosine-protein kinase
receptor EPH-3) M85491 transcripts which are detectable by amplicon as depicted in sequence
name M85491seg24 in normal and cancerous lung tissues

Expression of Ephrin type-B receptor 2 precursor (EC 2.7.1.112) (Tyrosine-protein
kinase receptor EPH-3) transcripts detectable by or according to seg24, M85491seg24 amplicon
(SEQ ID NO: 1639) and M85491seg24F (SEQ ID NO: 1637) and M85491seg24R (SEQ ID NO:
1638) primers was measured by real time PCR. In parallel the expression of four housekeeping
genes - PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID
NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ
ID NO: 1297), Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon,
SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-
amplicon, SEQ ID NO:331) was measured similarly. For each RT sample, the expression of the
above amplicon was normalized to the geometric mean of the quantities of the housekeeping
genes. The normalized quantity of each RT sample was then divided by the median of the
quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2
above, "Tissue samples in testing panel"), to obtain a value of fold up-regulation for each
sample relative to median of the normal PM samples.
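
The two-step normalization described above (geometric mean of the housekeeping genes, then the median of the normal samples) can be sketched as follows. This is an illustrative sketch only: the function names and all numeric quantities are hypothetical and are not data from the application.

```python
from statistics import geometric_mean, median


def normalize_to_housekeeping(target_qty, housekeeping_qtys):
    """Divide the target amplicon quantity by the geometric mean of the housekeeping quantities."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_vs_normal(normalized, normal_ids):
    """Divide each normalized quantity by the median of the normal samples."""
    reference = median(normalized[s] for s in normal_ids)
    return {sample: qty / reference for sample, qty in normalized.items()}


# Hypothetical quantities: (target amplicon, [PBGD, HPRT1, Ubiquitin, SDHA]).
raw = {
    "normal_1": (2.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_2": (4.0, [2.0, 2.0, 2.0, 2.0]),
    "normal_3": (6.0, [1.0, 1.0, 4.0, 4.0]),
    "tumor_1": (24.0, [2.0, 2.0, 2.0, 2.0]),
}
normalized = {s: normalize_to_housekeeping(t, hk) for s, (t, hk) in raw.items()}
fold = fold_vs_normal(normalized, ["normal_1", "normal_2", "normal_3"])
for sample, value in fold.items():
    print(sample, round(value, 2))
```

The geometric mean is used for the housekeeping genes so that no single highly expressed reference gene dominates the normalization factor.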

Figure 10 below is a histogram showing over-expression of the above-indicated Ephrin
type-B receptor 2 precursor (EC 2.7.1.112) (Tyrosine-protein kinase receptor EPH-3) transcripts
in cancerous lung samples relative to the normal samples. Values represent the average of
duplicate experiments. Error bars indicate the minimal and maximal values obtained. The
number and percentage of samples that exhibit at least 3 fold over-expression, out of the total
number of samples tested, is indicated at the bottom.

As is evident from Figure 10, the expression of Ephrin type-B receptor 2 precursor (EC
2.7.1.112) (Tyrosine-protein kinase receptor EPH-3) transcripts detectable by the above
amplicon in cancer samples was significantly higher than in the non-cancerous samples (Sample
Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel"). Notably, an over-
expression of at least 3 fold was found in 9 out of 15 adenocarcinoma samples and in 4 out of 8
small cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

A threshold of 3 fold overexpression was found to differentiate between cancer and
normal samples with a P value of 7.42E-03 in adenocarcinoma and 5.69E-02 in small cell
carcinoma, as checked by Fisher's exact test. The above values demonstrate statistical significance
of the results.
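
A threshold comparison of this kind reduces to a 2x2 table of counts (cancer or normal, at or above the threshold or below it). The sketch below shows a one-sided Fisher exact computation using only the standard library. The cancer counts (9 of 15) come from the text, but the normal-sample counts are not stated in this passage, so the 0 of 12 used here is a hypothetical placeholder and the result is not expected to reproduce the reported P values. The function name is likewise an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are (cancer, normal); columns are (at or above threshold, below threshold).
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return numerator / denominator


# 9 of 15 adenocarcinoma samples reached the 3 fold threshold (from the text).
# 0 of 12 normal samples is a hypothetical placeholder, not a reported value.
p_value = fisher_exact_greater(9, 6, 0, 12)
print(f"{p_value:.3e}")
```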
Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a non-
limiting illustrative example only of a suitable primer pair: M85491seg24F forward primer; and
M85491seg24R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: M85491seg24

M85491seg24F (SEQ ID NO: 1637) - GGCGTCTTTCTCCCTCTGAAC
M85491seg24R (SEQ ID NO: 1638) - GTCCCATTCTGGGTGCTGTG
M85491seg24 (SEQ ID NO: 1639) -
GGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCC
CTCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCAGAATGGG
AC
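
As a consistency check on the listing above, the forward primer should match the 5' end of the amplicon and the reverse complement of the reverse primer should match its 3' end. The following is a minimal sketch; the sequences are copied from the listing, while the helper names are hypothetical.

```python
FORWARD = "GGCGTCTTTCTCCCTCTGAAC"   # M85491seg24F (SEQ ID NO: 1637)
REVERSE = "GTCCCATTCTGGGTGCTGTG"    # M85491seg24R (SEQ ID NO: 1638)
AMPLICON = (                        # M85491seg24 (SEQ ID NO: 1639)
    "GGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCC"
    "CTCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCAGAATGGG"
    "AC"
)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with the forward primer and ends with the reverse primer site."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


print(primers_flank(AMPLICON, FORWARD, REVERSE))
```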



Expression of Ephrin type-B receptor 2 precursor (EC 2.7.1.112) (Tyrosine-protein kinase
receptor EPH-3) M85491 transcripts which are detectable by amplicon as depicted in sequence
name M85491seg24 in different normal tissues



Expression of Ephrin type-B receptor 2 precursor transcripts detectable by or according
to M85491seg24 amplicon (SEQ ID NO: 1639) and M85491seg24F (SEQ ID NO: 1637) and
M85491seg24R (SEQ ID NO: 1638) was measured by real time PCR. In parallel the
expression of four housekeeping genes - RPL19 (GenBank Accession No. NM_000981; RPL19
amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No. NM_003194; TATA
amplicon, SEQ ID NO: 1633), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the lung samples (Sample Nos. 15-17, Table 2, "Tissue sample on
normal panel", above), to obtain a value of relative expression of each sample relative to median
of the lung samples.

M85491seg24F (SEQ ID NO: 1637) - GGCGTCTTTCTCCCTCTGAAC
M85491seg24R (SEQ ID NO: 1638) - GTCCCATTCTGGGTGCTGTG
M85491seg24 (SEQ ID NO: 1639) -
GGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCC
CTCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCAGAATGGG
AC

The results are shown in Figure 11, demonstrating the expression of Ephrin type-B receptor 2 
precursor (Tyrosine-protein kinase receptor EPH-3) M85491 transcripts which are detectable by 
amplicon as depicted in sequence name M85491seg24 in different normal tissues. 


DESCRIPTION FOR CLUSTER T39971 
Cluster T39971 features 4 transcript(s) and 28 segment(s) of interest, the names for which
are given in Tables 40 and 41, respectively; the sequences themselves are given at the end of the
application. The selected protein variants are given in Table 42.

Table 40 - Transcripts of interest

Transcript Name    Sequence ID No.
T39971_T10         5
T39971_T12         6
T39971_T16         7
T39971_T5          8


Table 41 - Segments of interest

Segment Name       Sequence ID No.
T39971_node_0      168
T39971_node_18     169
T39971_node_21     170
T39971_node_22     171
T39971_node_23     172
T39971_node_31     173
T39971_node_33     174
T39971_node_7      175
T39971_node_1      176
T39971_node_10     177
T39971_node_11     178
T39971_node_12     179
T39971_node_15     180
T39971_node_16     181
T39971_node_17     182
T39971_node_26     183
T39971_node_27     184
T39971_node_28     185
T39971_node_29     186
T39971_node_3      187
T39971_node_30     188
T39971_node_34     189
T39971_node_35     190
T39971_node_36     191
T39971_node_4      192
T39971_node_5      193
T39971_node_8      194
T39971_node_9      195



WO 2006/131783    PCT/IB2005/004037

Table 42 - Proteins of interest

Protein Name       Sequence ID No.
T39971_P6          1285
T39971_P9          1286
T39971_P11         1287
T39971_P12         1288







These sequences are variants of the known protein Vitronectin precursor (SwissProt
accession identifier VTNC_HUMAN; known also according to the synonyms Serum spreading
factor; S-protein; V75), SEQ ID NO: 1418, referred to herein as the previously known protein.

Protein Vitronectin precursor is known or believed to have the following function(s):
Vitronectin is a cell adhesion and spreading factor found in serum and tissues. Vitronectin
interacts with glycosaminoglycans and proteoglycans. It is recognized by certain members of the
integrin family and serves as a cell-to-substrate adhesion molecule. It is an inhibitor of the
membrane-damaging effect of the terminal cytolytic complement pathway. The sequence for protein
Vitronectin precursor is given at the end of the application, as "Vitronectin precursor amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 43.

Table 43 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
122                                       A -> S. /FTId=VAR_012983.
268                                       R -> Q. /FTId=VAR_012984.
400                                       T -> M. /FTId=VAR_012985.
50                                        C -> N
225                                       S -> N
366                                       A -> T



Protein Vitronectin precursor localization is believed to be extracellular.

The previously known protein also has the following indication(s) and/or potential
therapeutic use(s): Cancer, melanoma. It has been investigated for clinical/therapeutic use in
humans, for example as a target for an antibody or small molecule, and/or as a direct
therapeutic; available information related to these investigations is as follows. Potential
pharmaceutically related or therapeutically related activity or activities of the previously known
protein are as follows: Alphavbeta3 integrin antagonist; Apoptosis agonist. A therapeutic role
for a protein represented by the cluster has been predicted. The cluster was assigned this field
because there was information in the drug database or the public databases (e.g., described
herein above) that this protein, or part thereof, is used or can be used for a potential therapeutic
indication: Anticancer.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: immune response; cell adhesion, which are annotation(s) related to
Biological Process; protein binding; heparin binding, which are annotation(s) related to
Molecular Function; and extracellular space, which are annotation(s) related to Cellular
Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or LocusLink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster T39971 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 12 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
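
The "parts per million" ratio described above can be illustrated with a short sketch. The weighting scheme applied to the EST counts is not described in this passage, so the function below assumes plain, unweighted counts; its name and the example numbers are hypothetical.

```python
def parts_per_million(cluster_ests, category_ests):
    """Express cluster ESTs as parts per million of all ESTs in the same tissue category.

    Assumes unweighted counts; any weighting used in the application is not modeled here.
    """
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical example: 3 cluster ESTs among 150,000 ESTs sampled from one tissue category.
print(parts_per_million(3, 150_000))
```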

Overall, the following results were obtained as shown with regard to the histograms in
Figure 12 and Table 44. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: liver cancer, lung malignant tumors and pancreas carcinoma.

Table 44 - Normal tissue distribution

Name of Tissue    Number
adrenal           60
bladder           0
Bone              0
Brain             9
Colon             0
epithelial        79
general           29
Liver             2164
Lung              0
Lymph nodes       0
Breast            0
pancreas          0
prostate          0
Skin              0
Uterus            0



Table 45 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
adrenal           6.9e-01    7.4e-01    2.0e-02    2.3    5.3e-02    1.8
bladder           5.4e-01    6.0e-01    5.6e-01    1.8    6.8e-01    1.5
Bone              1          6.7e-01    1          1.0    7.0e-01    1.4
Brain             8.0e-01    8.6e-01    3.0e-01    1.9    5.3e-01    1.2
Colon             4.2e-01    4.8e-01    7.0e-01    1.6    7.7e-01    1.4
epithelial        6.6e-01    5.7e-01    1.0e-01    0.8    8.7e-01    0.6
general           5.1e-01    3.8e-01    9.2e-08    1.6    8.3e-04    1.3
Liver             1          6.7e-01    2.3e-03    0.3    1          0.2
Lung              2.4e-01    9.1e-02    1.7e-01    4.3    8.1e-03    5.0
Lymph nodes       1          5.7e-01    1          1.0    5.8e-01    2.3
Breast            1          6.7e-01    1          1.0    8.2e-01    1.2
pancreas          9.5e-02    1.8e-01    1.5e-11    6.5    8.2e-09    4.6
prostate          7.3e-01    6.0e-01    6.7e-01    1.5    5.6e-01    1.7
Skin              1          4.4e-01    1          1.0    6.4e-01    1.6
Uterus            5.0e-01    2.6e-01    1          1.1    8.0e-01    1.4



As noted above, cluster T39971 features 4 transcript(s), which were listed in Table 40
above. These transcript(s) encode for protein(s) which are variant(s) of protein Vitronectin
precursor. A description of each variant protein according to the present invention is now
provided.



Variant protein T39971_P6 according to the present invention has an amino acid sequence
as given at the end of the application; it is encoded by transcript(s) T39971_T5. An alignment is
given to the known protein (Vitronectin precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between T39971_P6 and VTNC_HUMAN:

1. An isolated chimeric polypeptide encoding for T39971_P6, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKG corresponding to amino
acids 1 - 276 of VTNC_HUMAN, which also corresponds to amino acids 1 - 276 of
T39971_P6, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence TQGVVGD corresponding to amino acids
277 - 283 of T39971_P6, wherein said first and second amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for a tail of T39971_P6, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
TQGVVGD in T39971_P6.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither transmembrane
region prediction program predicts that this protein has a transmembrane region.

Variant protein T39971_P6 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 46 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein T39971_P6 sequence provides
support for the deduced sequence of this variant protein according to the present invention).



Table 46 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
122                                       A->S                         Yes
145                                       G->                          No
268                                       R->Q                         Yes
280                                       V->A                         Yes
180                                       C->                          No
180                                       C->W                         No
192                                       Y->                          No
209                                       A->                          No
211                                       T->                          No
267                                       G->                          No
267                                       G->A                         No
268                                       R->                          No



Variant protein T39971_P6 is encoded by the following transcript(s): T39971_T5, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
T39971_T5 is shown in bold; this coding portion starts at position 756 and ends at position
1604. The transcript also has the following SNPs as listed in Table 47 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
T39971_P6 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 47 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
417                                    G->C                        Yes
459                                    T->C                        Yes
1387                                   C->                         No
1406                                   ->A                         No
1406                                   ->G                         No
1555                                   G->                         No
1555                                   G->C                        No
1558                                   G->                         No
1558                                   G->A                        Yes
1594                                   T->C                        Yes
1642                                   T->C                        Yes
1770                                   C->T                        Yes
529                                    G->T                        Yes
1982                                   A->G                        No
2007                                   G->                         No
2029                                   T->C                        No
2094                                   T->C                        No
2117                                   C->G                        No
2123                                   C->T                        Yes
2152                                   C->T                        Yes
2182                                   G->T                        No
                                       A->C                        No
2297                                   T->C                        Yes
1119                                   G->T                        Yes
2411                                   G->                         No
2411                                   G->T                        No
2487                                   T->C                        Yes
1188                                   G->                         No
1295                                   C->                         No
1295                                   C->G                        No
1324                                   ->T                         No
1331                                   C->                         No
1381                                   C->                         No


Variant protein T39971_P9 according to the present invention has an amino acid sequence
as given at the end of the application; it is encoded by transcript(s) T39971_T10. An alignment
is given to the known protein (Vitronectin precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between T39971_P9 and VTNC_HUMAN:

1. An isolated chimeric polypeptide encoding for T39971_P9, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRT corresponding to amino acids 1 - 325 of
VTNC_HUMAN, which also corresponds to amino acids 1 - 325 of T39971_P9, and a second
amino acid sequence being at least 90 % homologous to
SGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRATWLSLFSSEESNLGA
NNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLRTRRVDTVDPPYPRSIAQYWLGC
PAPGHL corresponding to amino acids 357 - 478 of VTNC_HUMAN, which also corresponds
to amino acids 326 - 447 of T39971_P9, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of T39971_P9,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise TS, having a
structure as follows: a sequence starting from any of amino acid numbers 325-x to 325; and
ending at any of amino acid numbers 326 + ((n-2) - x), in which x varies from 0 to n-2.
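
The edge-portion definition above amounts to enumerating every window of length n that spans the junction between residues 325 and 326. A minimal sketch of that enumeration follows; the function name and the choice of n = 10 in the example are assumptions made for illustration.

```python
def edge_windows(n, junction=325):
    """Yield (start, end) residue numbers of each length-n window containing junction and junction + 1."""
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    for x in range(0, n - 1):                 # x varies from 0 to n-2
        start = junction - x                  # 325 - x
        end = junction + 1 + (n - 2) - x      # 326 + ((n-2) - x)
        yield start, end


windows = list(edge_windows(10))
for start, end in windows:
    print(start, end, end - start + 1)
```

Each window has exactly n residues, and every window contains both residue 325 and residue 326, which is what guarantees that the TS pair at the junction is included.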


The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither transmembrane
region prediction program predicts that this protein has a transmembrane region.

Variant protein T39971_P9 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 48 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein T39971_P9 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 48 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
122                                       A->S                         Yes
145                                       G->                          No
268                                       R->Q                         Yes
328                                       M->T                         No
350                                       S->P                         No
369                                       T->M                         Yes
379                                       S->I                         No
380                                       N->T                         No
180                                       C->                          No
180                                       C->W                         No
192                                       Y->                          No
209                                       A->                          No
211                                       T->                          No
267                                       G->                          No
267                                       G->A                         No
268                                       R->                          No

Variant protein T39971_P9 is encoded by the following transcript(s): T39971_T10, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
T39971_T10 is shown in bold; this coding portion starts at position 756 and ends at position
2096. The transcript also has the following SNPs as listed in Table 49 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
T39971_P9 sequence provides support for the deduced sequence of this variant protein
according to the present invention).
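
As a non-limiting sketch, the relationship between a nucleotide position on transcript T39971_T10 and the corresponding amino acid position may be computed from the stated coding start (756) and end (2096). The function name codon_number is hypothetical; the sketch assumes the coding portion is read in frame from its first nucleotide, which is consistent with the tables (for example, nucleotide 1119 falls in codon 122 and nucleotide 1738 in codon 328).

```python
from typing import Optional


def codon_number(
    nt_pos: int, cds_start: int = 756, cds_end: int = 2096
) -> Optional[int]:
    """Return the 1-based codon (amino acid) number containing nt_pos.

    Positions are 1-based transcript coordinates. Returns None when the
    nucleotide lies outside the coding portion (for example in a UTR).
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1
```

Positions such as 417 or 2120 lie outside the coding portion under these coordinates, so the sketch returns None for them.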

Table 49 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
417                                    G->C                        Yes
459                                    T->C                        Yes
1387                                   C->                         No
1406                                   ->A                         No
1406                                   ->G                         No
1555                                   G->                         No
1555                                   G->C                        No
1558                                   G->                         No
1558                                   G->A                        Yes
1738                                   T->C                        No
1803                                   T->C                        No
1826                                   C->G                        No
529                                    G->T                        Yes
1832                                   C->T                        Yes
1861                                   C->T                        Yes
1891                                   G->T                        No
1894                                   A->C                        No
2006                                   T->C                        Yes
2120                                   G->                         No
2120                                   G->T                        No
2196                                   T->C                        Yes
1119                                   G->T                        Yes
1188                                   G->                         No
1295                                   C->                         No
1295                                   C->G                        No
1324                                   ->T                         No
1331                                   C->                         No
1381                                   C->                         No

Variant protein T39971_P11 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) T39971_T12. An
alignment is given to the known protein (Vitronectin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between T39971_P11 and VTNC_HUMAN:

1. An isolated chimeric polypeptide encoding for T39971_P11, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRTS corresponding to amino acids 1 - 326 of
VTNC_HUMAN, which also corresponds to amino acids 1 - 326 of T39971_P11, and a second
amino acid sequence being at least 90 % homologous to
DKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL corresponding to amino acids 442
- 478 of VTNC_HUMAN, which also corresponds to amino acids 327 - 363 of T39971_P11,
wherein said first and second amino acid sequences are contiguous and in a sequential order.
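
For illustration only, the coordinate bookkeeping of such a chimeric polypeptide may be sketched as follows. The function name chimera_layout is hypothetical; given the ranges taken from the known protein in order, it returns where each range lands on the variant when the pieces are joined contiguously.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based, inclusive (start, end)


def chimera_layout(source_ranges: List[Range]) -> List[Tuple[Range, Range]]:
    """Map each source range to its position on the joined variant.

    The pieces are placed one after another with no gap, so the variant
    coordinates of a piece start right after the end of the previous one.
    """
    layout = []
    next_start = 1
    for start, end in source_ranges:
        if end < start:
            raise ValueError("range end precedes range start")
        length = end - start + 1
        layout.append(((start, end), (next_start, next_start + length - 1)))
        next_start += length
    return layout
```

Applied to the ranges recited above, chimera_layout([(1, 326), (442, 478)]) places the second piece at positions 327 to 363 of the variant, in agreement with the text.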

2. An isolated chimeric polypeptide encoding for an edge portion of T39971_P11,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise SD, having a
structure as follows: a sequence starting from any of amino acid numbers 326-x to 326; and
ending at any of amino acid numbers 327 + ((n-2) - x), in which x varies from 0 to n-2.

Comparison report between T39971_P11 and Q9BSH7 (SEQ ID NO:1696):

1. An isolated chimeric polypeptide encoding for T39971_P11, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGV
LDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEE
CEGSSLSAVFEHFAMMQRDSWEDIFELLFWGRTS corresponding to amino acids 1 - 326 of
Q9BSH7, which also corresponds to amino acids 1 - 326 of T39971_P11, and a second amino
acid sequence being at least 90 % homologous to
DKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL corresponding to amino acids 442
- 478 of Q9BSH7, which also corresponds to amino acids 327 - 363 of T39971_P11, wherein
said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of T39971_P11,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise SD, having a
structure as follows: a sequence starting from any of amino acid numbers 326-x to 326; and
ending at any of amino acid numbers 327 + ((n-2) - x), in which x varies from 0 to n-2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein T39971_P11 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 50, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein T39971_P11 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 50 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
122                                       A->S                         Yes
145                                       G->                          No
268                                       R->Q                         Yes
180                                       C->                          No
180                                       C->W                         No
192                                       Y->                          No
209                                       A->                          No
211                                       T->                          No
267                                       G->                          No
267                                       G->A                         No
268                                       R->                          No


Variant protein T39971_P11 is encoded by the following transcript(s): T39971_T12, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
T39971_T12 is shown in bold; this coding portion starts at position 756 and ends at position
1844. The transcript also has the following SNPs as listed in Table 51 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
T39971_P11 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 51 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
417                                    G->C                        Yes
459                                    T->C                        Yes
1387                                   C->                         No
1406                                   ->A                         No
1406                                   ->G                         No
1555                                   G->                         No
1555                                   G->C                        No
1558                                   G->                         No
1558                                   G->A                        Yes
1754                                   T->C                        Yes
1868                                   G->                         No
1868                                   G->T                        No
529                                    G->T                        Yes
1944                                   T->C                        Yes
1119                                   G->T                        Yes
1188                                   G->                         No
1295                                   C->                         No
1295                                   C->G                        No
1324                                   ->T                         No
1331                                   C->                         No
1381                                   C->                         No


Variant protein T39971_P12 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) T39971_T16. An
alignment is given to the known protein (Vitronectin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between T39971_P12 and VTNC_HUMAN:

1. An isolated chimeric polypeptide encoding for T39971_P12, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFK corresponding to
amino acids 1 - 223 of VTNC_HUMAN, which also corresponds to amino acids 1 - 223 of
T39971_P12, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence VPGAVGQGRKHLGRV corresponding to
amino acids 224 - 238 of T39971_P12, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of T39971_P12, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
VPGAVGQGRKHLGRV in T39971_P12.

Comparison report between T39971_P12 and Q9BSH7:

1. An isolated chimeric polypeptide encoding for T39971_P12, comprising a first amino
acid sequence being at least 90 % homologous to
MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAEC
KPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPV
LKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEELCSGKPFDAFTDLKNGSLFAFR
GQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFK corresponding to
amino acids 1 - 223 of Q9BSH7, which also corresponds to amino acids 1 - 223 of T39971_P12,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence VPGAVGQGRKHLGRV corresponding to amino acids 224 -
238 of T39971_P12, wherein said first and second amino acid sequences are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of T39971_P12, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
VPGAVGQGRKHLGRV in T39971_P12.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein T39971_P12 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 52, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein T39971_P12 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 52 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
122                                       A->S                         Yes
145                                       G->                          No
180                                       C->                          No
180                                       C->W                         No
192                                       Y->                          No
209                                       A->                          No
211                                       T->                          No


Variant protein T39971_P12 is encoded by the following transcript(s): T39971_T16, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
T39971_T16 is shown in bold; this coding portion starts at position 756 and ends at position
1469. The transcript also has the following SNPs as listed in Table 53 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
T39971_P12 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 53 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
417                                    G->C                        Yes
459                                    T->C                        Yes
1387                                   C->                         No
1406                                   ->A                         No
1406                                   ->G                         No
529                                    G->T                        Yes
1119                                   G->T                        Yes
1188                                   G->                         No
1295                                   C->                         No
1295                                   C->G                        No
1324                                   ->T                         No
1331                                   C->                         No
1381                                   C->                         No



As noted above, cluster T39971 features 28 segment(s), which were listed in Table 41
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.
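
By way of a non-limiting sketch, the segment coordinates given for a transcript in the segment location tables that follow (Tables 54 to 81) may be checked for contiguity and converted to lengths as shown below. The names T10_SEGMENTS, segment_lengths and find_breaks are hypothetical; the coordinates are those listed for transcript T39971_T10.

```python
from typing import Dict, List, Tuple

# Start/end positions (1-based, inclusive) of each segment on transcript
# T39971_T10, as listed in the segment location tables of the description.
T10_SEGMENTS: Dict[str, Tuple[int, int]] = {
    "node_0": (1, 810), "node_1": (811, 819), "node_3": (820, 861),
    "node_4": (862, 881), "node_5": (882, 939), "node_7": (940, 1162),
    "node_8": (1163, 1168), "node_9": (1169, 1188), "node_10": (1189, 1232),
    "node_11": (1233, 1270), "node_12": (1271, 1284), "node_15": (1285, 1316),
    "node_16": (1317, 1340), "node_17": (1341, 1424), "node_21": (1425, 1581),
    "node_23": (1582, 1734), "node_28": (1735, 1743), "node_29": (1744, 1838),
    "node_30": (1839, 1846), "node_31": (1847, 1986), "node_33": (1987, 2113),
    "node_34": (2114, 2120), "node_35": (2121, 2137), "node_36": (2138, 2199),
}


def segment_lengths(segments: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Length in nucleotides of each segment (end - start + 1)."""
    return {name: end - start + 1 for name, (start, end) in segments.items()}


def find_breaks(segments: Dict[str, Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Return pairs of neighbouring segments that are not directly adjacent."""
    ordered = sorted(segments.items(), key=lambda item: item[1][0])
    breaks = []
    for (prev_name, (_, prev_end)), (next_name, (next_start, _)) in zip(
        ordered, ordered[1:]
    ):
        if next_start != prev_end + 1:
            breaks.append((prev_name, next_name))
    return breaks
```

With these coordinates find_breaks reports no breaks, that is, the listed segments tile positions 1 to 2199 of T39971_T10 without gaps or overlaps.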

Segment cluster T39971_node_0 according to the present invention is supported by 76
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 54 below describes the starting and ending position of this segment on each transcript.

Table 54 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1                            810
T39971_T12         1                            810
T39971_T16         1                            810
T39971_T5          1                            810



Segment cluster T39971_node_18 according to the present invention is supported by 1
library. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T16. Table 55 below describes the starting and
ending position of this segment on each transcript.

Table 55 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T16         1425                         1592



Segment cluster T39971_node_21 according to the present invention is supported by 99
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 56
below describes the starting and ending position of this segment on each transcript.

Table 56 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1425                         1581
T39971_T12         1425                         1581
T39971_T5          1425                         1581



Segment cluster T39971_node_22 according to the present invention is supported by 7
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T5. Table 57 below describes the starting and
ending position of this segment on each transcript.

Table 57 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T5          1582                         1779

Segment cluster T39971_node_23 according to the present invention is supported by 101
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 58
below describes the starting and ending position of this segment on each transcript.

Table 58 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1582                         1734
T39971_T12         1582                         1734
T39971_T5          1780                         1932



Segment cluster T39971_node_31 according to the present invention is supported by 94
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10 and T39971_T5. Table 59 below describes the
starting and ending position of this segment on each transcript.

Table 59 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1847                         1986
T39971_T5          2138                         2277

Segment cluster T39971_node_33 according to the present invention is supported by 77
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 60
below describes the starting and ending position of this segment on each transcript.

Table 60 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1987                         2113
T39971_T12         1735                         1861
T39971_T5          2278                         2404



Segment cluster T39971_node_7 according to the present invention is supported by 87
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 61 below describes the starting and ending position of this segment on each transcript.

Table 61 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         940                          1162
T39971_T12         940                          1162
T39971_T16         940                          1162
T39971_T5          940                          1162



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster T39971_node_1 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 62
below describes the starting and ending position of this segment on each transcript.

Table 62 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         811                          819
T39971_T12         811                          819
T39971_T16         811                          819
T39971_T5          811                          819



Segment cluster T39971_node_10 according to the present invention is supported by 77
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 63 below describes the starting and ending position of this segment on each transcript.



Table 63 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1189                         1232
T39971_T12         1189                         1232
T39971_T16         1189                         1232
T39971_T5          1189                         1232



Segment cluster T39971_node_11 according to the present invention is supported by 79
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 64 below describes the starting and ending position of this segment on each transcript.



Table 64 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1233                         1270
T39971_T12         1233                         1270
T39971_T16         1233                         1270
T39971_T5          1233                         1270

Segment cluster T39971_node_12 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 65
below describes the starting and ending position of this segment on each transcript.

Table 65 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1271                         1284
T39971_T12         1271                         1284
T39971_T16         1271                         1284
T39971_T5          1271                         1284



Segment cluster T39971_node_15 according to the present invention is supported by 79
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 66 below describes the starting and ending position of this segment on each transcript.



Table 66 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1285                         1316
T39971_T12         1285                         1316
T39971_T16         1285                         1316
T39971_T5          1285                         1316

Segment cluster T39971_node_16 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 67
below describes the starting and ending position of this segment on each transcript.

Table 67 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1317                         1340
T39971_T12         1317                         1340
T39971_T16         1317                         1340
T39971_T5          1317                         1340



Segment cluster T39971_node_17 according to the present invention is supported by 86
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 68 below describes the starting and ending position of this segment on each transcript.

Table 68 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1341                         1424
T39971_T12         1341                         1424
T39971_T16         1341                         1424
T39971_T5          1341                         1424



Segment cluster T39971_node_26 according to the present invention is supported by 85
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T5. Table 69 below describes the starting and
ending position of this segment on each transcript.



Table 69 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T5          1933                         1974

Segment cluster T39971_node_27 according to the present invention is supported by 90
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T5. Table 70 below describes the starting and
ending position of this segment on each transcript.

Table 70 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T5          1975                         2025



Segment cluster T39971_node_28 according to the present invention can be found in the
following transcript(s): T39971_T10 and T39971_T5. Table 71 below describes the starting and
ending position of this segment on each transcript.



Table 71 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1735                         1743
T39971_T5          2026                         2034



Segment cluster T39971_node_29 according to the present invention is supported by 99
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10 and T39971_T5. Table 72 below describes the
starting and ending position of this segment on each transcript.

Table 72 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1744                         1838
T39971_T5          2035                         2129



Segment cluster T39971_node_3 according to the present invention is supported by 78
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 73 below describes the starting and ending position of this segment on each transcript.



Table 73 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         820                          861
T39971_T12         820                          861
T39971_T16         820                          861
T39971_T5          820                          861



Segment cluster T39971_node_30 according to the present invention can be found in the
following transcript(s): T39971_T10 and T39971_T5. Table 74 below describes the starting and
ending position of this segment on each transcript.



Table 74 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1839                         1846
T39971_T5          2130                         2137



Segment cluster T39971_node_34 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 75 below describes
the starting and ending position of this segment on each transcript.



Table 75 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         2114                         2120
T39971_T12         1862                         1868
T39971_T5          2405                         2411

Segment cluster T39971_node_35 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 76 below describes
the starting and ending position of this segment on each transcript.

Table 76 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         2121                         2137
T39971_T12         1869                         1885
T39971_T5          2412                         2428



Segment cluster T39971_node_36 according to the present invention is supported by 51
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12 and T39971_T5. Table 77
below describes the starting and ending position of this segment on each transcript.

Table 77 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         2138                         2199
T39971_T12         1886                         1947
T39971_T5          2429                         2490

Segment cluster T39971_node_4 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 78
below describes the starting and ending position of this segment on each transcript.

Table 78 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         862                          881
T39971_T12         862                          881
T39971_T16         862                          881
T39971_T5          862                          881



Segment cluster T39971_node_5 according to the present invention is supported by 80
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5.
Table 79 below describes the starting and ending position of this segment on each transcript.

Table 79 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         882                          939
T39971_T12         882                          939
T39971_T16         882                          939
T39971_T5          882                          939



Segment cluster T39971_node_8 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 80
below describes the starting and ending position of this segment on each transcript.



Table 80 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1163                         1168
T39971_T12         1163                         1168
T39971_T16         1163                         1168
T39971_T5          1163                         1168



Segment cluster T39971_node_9 according to the present invention can be found in the
following transcript(s): T39971_T10, T39971_T12, T39971_T16 and T39971_T5. Table 81
below describes the starting and ending position of this segment on each transcript.

Table 81 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T39971_T10         1169                         1188
T39971_T12         1169                         1188
T39971_T16         1169                         1188
T39971_T5          1169                         1188



Variant protein alignment to the previously known protein:

Sequence name: /tmp/xkraCL20cZ/4 3L7YcPH7x : VTNC_HUMAN

Sequence documentation:

Alignment of: T39971_P6 x VTNC_HUMAN

Alignment segment 1/1:

Quality: 2774.00
Escore: 0
Matching length: 278
Total length: 278
Matching Percent Similarity: 99.64
Matching Percent Identity: 99.64
Total Percent Similarity: 99.64
Total Percent Identity: 99.64
Gaps: 0

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250

251 PDNVDAALALPAHSYSGRERVYFFKGTQ 278
    |||||||||||||||||||||||||| |
251 PDNVDAALALPAHSYSGRERVYFFKGKQ 278



30 
Sequence name: /tmp/X4DeeuSlB4/yMubSR5FPs : VTNC_HUMAN

Sequence documentation:

Alignment of: T39971_P9 x VTNC_HUMAN

Alignment segment 1/1:

Quality: 4430.00
Escore:
Matching length: 447                   Total length: 478
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 93.51        Total Percent Identity: 93.51
Gaps: 1

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250

251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300

301 VFEHFAMMQRDSWEDIFELLFWGRT......................... 325
    |||||||||||||||||||||||||
301 VFEHFAMMQRDSWEDIFELLFWGRTSAGTRQPQFISRDWHGVPGQVDAAM 350

326 ......SGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRAT 369
          ||||||||||||||||||||||||||||||||||||||||||||
351 AGRIYISGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRAT 400

370 WLSLFSSEESNLGANNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLR 419
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 WLSLFSSEESNLGANNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLR 450

420 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 447
    ||||||||||||||||||||||||||||
451 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 478

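The percentage figures in the alignment reports above can be checked with a short calculation. The following is a minimal sketch only: it assumes that the "Matching Percent" values divide the number of identical residues by the matching length and that the "Total Percent" values divide it by the total length. This reading reproduces the reported numbers but is not stated by the alignment program itself, and the function name is hypothetical.

```python
from typing import Tuple


def alignment_percentages(identical: int, matching_length: int,
                          total_length: int) -> Tuple[float, float]:
    """Return (matching percent identity, total percent identity).

    Assumption: both values are simple ratios of identical residues,
    rounded to two decimals as in the reports.
    """
    if matching_length <= 0 or total_length <= 0:
        raise ValueError("lengths must be positive")
    matching_pct = round(100.0 * identical / matching_length, 2)
    total_pct = round(100.0 * identical / total_length, 2)
    return matching_pct, total_pct


# T39971_P9 x VTNC_HUMAN: 447 aligned residues, all identical, total length 478
print(alignment_percentages(447, 447, 478))   # (100.0, 93.51)
# T39971_P6 x VTNC_HUMAN: 278 aligned residues with one mismatch
print(alignment_percentages(277, 278, 278))   # (99.64, 99.64)
```
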
Sequence name: /tmp/jvplVtnxNy/wxNSeFVZZw : VTNC_HUMAN

Sequence documentation:

Alignment of: T39971_P11 x VTNC_HUMAN

Alignment segment 1/1:

Quality: 3576.00
Escore: 0
Matching length: 363                   Total length: 478
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 75.94        Total Percent Identity: 75.94
Gaps: 1

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250

251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300

301 VFEHFAMMQRDSWEDIFELLFWGRTS........................ 326
    ||||||||||||||||||||||||||
301 VFEHFAMMQRDSWEDIFELLFWGRTSAGTRQPQFISRDWHGVPGQVDAAM 350

326 .................................................. 326

351 AGRIYISGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRAT 400

327 .........................................DKYYRVNLR 335
                                             |||||||||
401 WLSLFSSEESNLGANNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLR 450

336 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 363
    ||||||||||||||||||||||||||||
451 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 478
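For readers who wish to verify an alignment block by hand, the counting can be sketched as follows. This is an illustrative helper only: the function names are hypothetical, and the use of "." as the gap symbol is an assumption about how unaligned columns are rendered, not something the report states.

```python
import re


def identical_columns(top: str, bottom: str, gap: str = ".") -> int:
    """Count aligned columns in which both rows carry the same residue."""
    return sum(1 for a, b in zip(top, bottom) if a == b and a != gap)


def gap_runs(row: str, gap: str = ".") -> int:
    """Count maximal runs of the gap symbol in one aligned row."""
    return len(re.findall(re.escape(gap) + "+", row))


# Block 301-350 of the T39971_P11 x VTNC_HUMAN alignment; the 24 trailing
# gap symbols are an assumed rendering of the unaligned columns.
variant_row = "VFEHFAMMQRDSWEDIFELLFWGRTS" + "." * 24
known_row = "VFEHFAMMQRDSWEDIFELLFWGRTSAGTRQPQFISRDWHGVPGQVDAAM"

print(identical_columns(variant_row, known_row))  # 26
print(gap_runs(variant_row))                      # 1
```

Summing `identical_columns` over all blocks of one alignment gives the number of identical residues used for the percent identity figures.
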

Sequence name: /tmp/jvplVtnxNy/wxNSeFVZZw : Q9BSH7

Sequence documentation:

Alignment of: T39971_P11 x Q9BSH7

Alignment segment 1/1:

Quality: 3576.00
Escore: 0
Matching length: 363                   Total length: 478
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 75.94        Total Percent Identity: 75.94
Gaps: 1

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFKGSQYWRFEDGVLDPDYPRNISDGFDGI 250

251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 PDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSA 300

301 VFEHFAMMQRDSWEDIFELLFWGRTS........................ 326
    ||||||||||||||||||||||||||
301 VFEHFAMMQRDSWEDIFELLFWGRTSAGTRQPQFISRDWHGVPGQVDAAM 350

326 .................................................. 326

351 AGRIYISGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRAM 400

327 .........................................DKYYRVNLR 335
                                             |||||||||
401 WLSLFSSEESNLGANNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLR 450

336 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 363
    ||||||||||||||||||||||||||||
451 TRRVDTVDPPYPRSIAQYWLGCPAPGHL 478



Sequence name: /tmp/fgebv7ir4i/48bTBMziJO : VTNC_HUMAN

Sequence documentation:

Alignment of: T39971_P12 x VTNC_HUMAN

Alignment segment 1/1:

Quality: 2237.00
Escore: 0
Matching length: 223                   Total length: 223
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFK 223
    |||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFK 223




Sequence name: /tmp/fgebv7ir4i/48bTBMziJO : Q9BSH7

Sequence documentation:

Alignment of: T39971_P12 x Q9BSH7

Alignment segment 1/1:

Quality: 2237.00
Escore: 0
Matching length: 223                   Total length: 223
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSC 50

 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CTDYTAECKPQVTRGDVFTMPEDEYTVYDDGEEKNNATVHEQVGGPSLTS 100

101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 DLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPP 150

151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 AEEELCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVW 200

201 GIEGPIDAAFTRINCQGKTYLFK 223
    |||||||||||||||||||||||
201 GIEGPIDAAFTRINCQGKTYLFK 223

DESCRIPTION FOR CLUSTER Z21368
Cluster Z21368 features 7 transcript(s) and 34 segment(s) of interest, the names for
which are given in Tables 82 and 83, respectively; the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 84.

Table 82 - Transcripts of interest

Transcript Name       Sequence ID No.
Z21368_PEA_1_T10      9
Z21368_PEA_1_T11      10
Z21368_PEA_1_T23      11
Z21368_PEA_1_T24      12
Z21368_PEA_1_T5       13
Z21368_PEA_1_T6       14
Z21368_PEA_1_T9       15

Table 83 - Segments of interest

Segment Name             Sequence ID No.
Z21368_PEA_1_node_0      1067
Z21368_PEA_1_node_15     1068
Z21368_PEA_1_node_19     1069
Z21368_PEA_1_node_2      1070
Z21368_PEA_1_node_21     1071
Z21368_PEA_1_node_33     1072
Z21368_PEA_1_node_36     1073
Z21368_PEA_1_node_37     1074
Z21368_PEA_1_node_39     1075
Z21368_PEA_1_node_4      1076
Z21368_PEA_1_node_41     1077
Z21368_PEA_1_node_43     1078
Z21368_PEA_1_node_45     1079
Z21368_PEA_1_node_53     1080
Z21368_PEA_1_node_56     1081
Z21368_PEA_1_node_58     1082
Z21368_PEA_1_node_66     1083
Z21368_PEA_1_node_67     1084
Z21368_PEA_1_node_69     1085
Z21368_PEA_1_node_11     1086
Z21368_PEA_1_node_12     1087
Z21368_PEA_1_node_16     1088
Z21368_PEA_1_node_17     1089
Z21368_PEA_1_node_23     1090
Z21368_PEA_1_node_24     1091
Z21368_PEA_1_node_30     1092
Z21368_PEA_1_node_31     1093
Z21368_PEA_1_node_38     1094
Z21368_PEA_1_node_47     1095
Z21368_PEA_1_node_49     1096
Z21368_PEA_1_node_51     1097
Z21368_PEA_1_node_61     1098
Z21368_PEA_1_node_68     1099
Z21368_PEA_1_node_7      1100

Table 84 - Proteins of interest

Protein Name          Sequence ID No.
Z21368_PEA_1_P2       1289
Z21368_PEA_1_P5       1290
Z21368_PEA_1_P15      1291
Z21368_PEA_1_P16      1292
Z21368_PEA_1_P22      1293
Z21368_PEA_1_P23      1294

These sequences are variants of the known protein Extracellular sulfatase Sulf-1 precursor
(SwissProt accession identifier SUL1_HUMAN; known also according to the synonyms EC
3.1.6.-; HSulf-1), SEQ ID NO: 1419, referred to herein as the previously known protein.

Protein Extracellular sulfatase Sulf-1 precursor is known or believed to have the following
function(s): Exhibits arylsulfatase activity and highly specific endoglucosamine-6-sulfatase
activity. It can remove sulfate from the C-6 position of glucosamine within specific subregions
of intact heparin. Diminishes HSPG (heparan sulfate proteoglycans) sulfation, inhibits signaling
by heparin-dependent growth factors, diminishes proliferation, and facilitates apoptosis in
response to exogenous stimulation. The sequence for protein Extracellular sulfatase Sulf-1
precursor is given at the end of the application, as "Extracellular sulfatase Sulf-1 precursor
amino acid sequence". Known polymorphisms for this sequence are as shown in Table 85.

Table 85 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
87-88                                     CC->AA: LOSS OF ARYLSULFATASE ACTIVITY AND LOSS OF ABILITY TO MODULATE APOPTOSIS.
49                                        L->P
728                                       K->R



Protein Extracellular sulfatase Sulf-1 precursor localization is believed to be
Endoplasmic reticulum and Golgi stack. Also localized on the cell surface (By similarity).

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: apoptosis; metabolism; heparan sulfate proteoglycan metabolism,
which are annotation(s) related to Biological Process; arylsulfatase; hydrolase, which are
annotation(s) related to Molecular Function; and extracellular space; endoplasmic reticulum;
Golgi apparatus, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster Z21368 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 13 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
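The "parts per million" ratio described above can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name and the example counts are hypothetical, and the weighting applied to the EST counts is not described in this section and is therefore omitted.

```python
def parts_per_million(cluster_ests: float, category_ests: float) -> float:
    """Express the ESTs of one cluster as parts per million of all ESTs
    observed in the same tissue category (unweighted illustration)."""
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical counts: 5 ESTs of the cluster among 100,000 ESTs in a category
print(parts_per_million(5, 100_000))  # 50.0
```
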

Overall, the following results were obtained as shown with regard to the histograms in
Figure 13 and Table 86. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors
from different tissues and pancreas carcinoma.

Table 86 - Normal tissue distribution

Name of Tissue    Number
bladder           123
Bone              557
Brain             34
Colon             94
epithelial        56
general           68
head and neck     0
kidney            35
Lung              22
Lymph nodes       0
Breast            52
muscle            31
Ovary             0
pancreas          0
prostate          44
Skin              67
stomach           109
T cells           0
Thyroid           0
Uterus            140



Table 87 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
bladder           5.4e-01    6.6e-01    6.4e-01    1.0    8.5e-01    0.7
Bone              4.5e-01    8.2e-01    9.1e-01    0.4    1          0.3
Brain             5.5e-01    7.3e-01    1.5e-01    1.5    5.0e-01    0.9
Colon             1.4e-01    2.8e-01    1.0e-01    2.0    3.0e-01    1.4
epithelial        1.1e-03    1.5e-01    1.2e-07    2.1    1.0e-01    1.1
general           1.4e-05    5.3e-02    1.9e-06    1.6    6.7e-01    0.8
head and neck     2.4e-02    7.1e-02    4.6e-01    2.5    7.5e-01    1.4
kidney            8.9e-01    9.0e-01    1          0.4    1          0.4
Lung              3.5e-01    4.1e-01    7.2e-03    2.6    1.0e-01    1.6
Lymph nodes       7.7e-02    3.1e-01    2.3e-02    8.5    1.9e-01    3.2
Breast            4.0e-01    6.1e-01    5.4e-02    2.3    3.0e-01    1.3
muscle            7.5e-02    3.5e-02    1          1.0    1.7e-01    1.7
Ovary             3.8e-01    4.2e-01    2.2e-01    2.9    3.4e-01    2.2
pancreas          2.2e-02    6.9e-02    1.4e-08    6.5    1.4e-06    4.6
prostate          8.3e-01    8.9e-01    3.1e-01    1.4    5.2e-01    1.1
Skin              6.1e-01    8.1e-01    6.0e-01    1.2    1          0.3
stomach           4.4e-02    5.0e-01    5.0e-01    0.8    9.7e-01    0.4
T cells           5.0e-01    6.7e-01    3.3e-01    3.1    7.2e-01    1.4
Thyroid           3.6e-01    3.6e-01    1          1.1    1          1.1
Uterus            3.5e-01    7.8e-01    4.6e-01    0.9    9.1e-01    0.5




As noted above, cluster Z21368 features 7 transcript(s), which were listed in Table 82
above. These transcript(s) encode for protein(s) which are variant(s) of protein Extracellular
sulfatase Sulf-1 precursor. A description of each variant protein according to the present
invention is now provided.



Variant protein Z21368_PEA_1_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T5. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z21368_PEA_1_P2 and SUL1_HUMAN:
1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P2, comprising a first
amino acid sequence being at least 90% homologous to

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 

QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNH 

QAMHEPRTFAVYLlSRNfTG 

NGIKEKHGFDYAKDYFTDLITN^ 
FSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTO

SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP 

GSIVPQIVLNIDLAPTILDIAGL^ 

VERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGK 
LRIHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQ 
GTPKYKPRFVHTRQTRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPR^
ASSGGNRGRMLADSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYI 
DKEIEALQDKIKNL^ 
AAQEVDSKLQLFKENNRRRKKERKEKR^ 
corresponding to amino acids 1 - 761 of SUL1_HUMAN, which also corresponds to amino
acids 1 - 761 of Z21368_PEA_1_P2, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
PHKYSAHGRTRHFESATRTTNGAQKLSRI corresponding to amino acids 762 - 790 of
Z21368_PEA_1_P2, wherein said first and second amino acid sequences are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of Z21368_PEA_1_P2, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence PHKYSAHGRTRHFESATRTTNGAQKLSRI in Z21368_PEA_1_P2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
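The reasoning stated above for the "secreted" call can be written as a simple rule. The sketch below is illustrative only: the function name and the boolean inputs are hypothetical, and outcomes other than "secreted" are not described in this passage, so they are reported as "undetermined".

```python
from typing import Sequence


def call_localization(signal_peptide_calls: Sequence[bool],
                      transmembrane_calls: Sequence[bool]) -> str:
    """Return "secreted" when every signal-peptide predictor reports a
    signal peptide and no trans-membrane predictor reports a region."""
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one prediction of each kind is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"


# Two signal-peptide programs agree; neither trans-membrane program finds a region
print(call_localization([True, True], [False, False]))  # secreted
```
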

Variant protein Z21368_PEA_1_P2 is encoded by the following transcript(s):
Z21368_PEA_1_T5, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T5 is shown in bold; this coding portion starts at
position 529 and ends at position 2898.
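The stated coding positions can be cross-checked against the stated protein lengths with a short calculation. This is a minimal sketch with a hypothetical function name; it assumes the start and end positions are 1-based and inclusive.

```python
def codon_count(start: int, end: int) -> int:
    """Number of complete codons in a coding portion given by 1-based,
    inclusive nucleotide positions."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return length // 3


# Z21368_PEA_1_T5: positions 529..2898; Z21368_PEA_1_P2 spans amino acids 1 - 790
print(codon_count(529, 2898))  # 790
```

Under this assumption the codon count equals the number of amino acids given for the variant, which suggests that the stated range does not include a stop codon.
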


Variant protein Z21368_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T9. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the



WO 2006/131783 



PCT/IB2005/004037 




relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z21368_PEA_1_P5 and Q7Z2W2 (SEQ ID NO:1697):
1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first
amino acid sequence being at least 90% homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVEL
corresponding to amino acids 1 - 57 of Q7Z2W2, which also corresponds to amino acids 1 - 57
of Z21368_PEA_1_P5, a second bridging amino acid sequence comprising A, and a third amino
acid sequence being at least 90% homologous to

FFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITN 

ESINYFKMSKIUVLYPHRPVMMVI 
DKHWIMQYTGPMLPIHMEFTM 

ADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDT 
PPDVDGKSVLKLLDPEKPGNRFRTN^ 

PKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLY 

ARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFE 

GEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPT 

tvrvthkcfilpndsihcerelyqsarawkdhkayidkeiealqdkiknlrewghl^ 
rkteecscskqsyynkekgvkkqekl^ 

kekrrqrkgeecslpgltcfthdnnhwqtapf™lgsfcactssnm^twclrtv^ 

THNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCN 
PRPKNLDVGNKDGGSYDLHRGQLWDGWEG corresponding to amino acids 139 - 871 of
Q7Z2W2, which also corresponds to amino acids 59 - 791 of Z21368_PEA_1_P5, wherein said
first, second and third amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for an edge portion of Z21368_PEA_1_P5,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least three amino acids comprise LAF, the
sequence having a structure as follows (numbering according to Z21368_PEA_1_P5): a
sequence starting from any of amino acid numbers 57-x to 57; and ending at any of amino acid
numbers 59 + ((n-2) - x), in which x varies from 0 to n-2.
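The correspondence between the numbering of Z21368_PEA_1_P5 and that of Q7Z2W2 given in the comparison report above can be expressed as a position map. The sketch below is illustrative and uses a hypothetical function name; it treats the bridging amino acid at position 58 as having no counterpart in Q7Z2W2.

```python
from typing import Optional


def p5_to_q7z2w2(position: int) -> Optional[int]:
    """Map a 1-based position in Z21368_PEA_1_P5 to the corresponding
    position in Q7Z2W2, or None for the bridging amino acid."""
    if 1 <= position <= 57:
        return position          # first sequence: 1 - 57 on both proteins
    if position == 58:
        return None              # bridging amino acid A
    if 59 <= position <= 791:
        return position + 80     # third sequence: 59 - 791 maps to 139 - 871
    raise ValueError("position outside Z21368_PEA_1_P5 (1 - 791)")


print(p5_to_q7z2w2(59))   # 139
print(p5_to_q7z2w2(791))  # 871
```
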

Comparison report between Z21368_PEA_1_P5 and AAH12997 (SEQ ID NO: 1698):
1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERXNIRPNIILVLTDDQDVELAFF 
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKH
INYFKMSKRMYPHRPVMM^ 

HWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIY 
HGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPP 
DVDGKSVLKLLDPEKPGNRF 
KYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYA
RGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFEGE 
IYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTV 
RVTHKCFILPNDSIHCERELYQSARAW 

PEECSCSKQSYYNKEKGVKKQEKLKSHLHPFKEAAQEVDSKL^ 

KIIRQRKGEECSLPGLTCFTHDN1SIHWQTAPFWNLGSFCACTS

NFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLME corresponding to
amino acids 1 - 751 of Z21368_PEA_1_P5, and a second amino acid sequence being at least 90%
homologous to LRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG
corresponding to amino acids 1 - 40 of AAH12997, which also corresponds to amino acids 752 -
791 of Z21368_PEA_1_P5, wherein said first and second amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for a head of Z21368_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELAFF 
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNES

mYFBCMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDK 

HWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIYTAD 

HGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPP 

DVDGKSVLKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLP 

KYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYA 

RGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFEGE 

IYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTV 

RVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRK 

PEECSCSKQSYYNKEKGVK^QEKXKSHLHPFKEAAQEVDSKLQLFK^NNRFIRKKERKE 

KRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETH 

NFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLME of
Z21368_PEA_1_P5.

Comparison report between Z21368_PEA_1_P5 and SUL1_HUMAN:
1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P5, comprising a first
amino acid sequence being at least 90% homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVEL
corresponding to amino acids 1 - 57 of SUL1_HUMAN, which also corresponds to amino acids
1 - 57 of Z21368_PEA_1_P5, and a second amino acid sequence being at least 90%
homologous to

AFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEK^GFDYAKDYFTDLIT 
NESINYFKMSKPJVIYPHRPVMMVISHAAPH^ 

MDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYn 

YTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGL 

DTPPDVDGKSVLKLLDPEKPGNPJFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSN 

HLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRN 

LYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVE 

FEGEnrDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGP 

PTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHL 

KRRKPEECSCSKQSYYNKEKGVKXQEKLKSHLHPFK^AAQEVDSKLQLFKliNNRRRK 

KERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFTOLGSFCACTSSr^NTYWCLRT 
VNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYK
QCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG corresponding to amino acids 138 - 871
of SUL1_HUMAN, which also corresponds to amino acids 58 - 791 of Z21368_PEA_1_P5,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of Z21368_PEA_1_P5,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise LA, having a
structure as follows: a sequence starting from any of amino acid numbers 57-x to 57; and ending
at any of amino acid numbers 58 + ((n-2) - x), in which x varies from 0 to n-2.
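The family of edge-portion sequences defined above can be enumerated directly from the formula. The following sketch is illustrative and uses a hypothetical function name; it lists, for a given length n, every (start, end) pair obtained as x runs from 0 to n-2, with positions numbered according to Z21368_PEA_1_P5.

```python
from typing import List, Tuple


def edge_windows(n: int, left: int = 57, right: int = 58) -> List[Tuple[int, int]]:
    """List the (start, end) positions of all length-n windows that span
    the junction between positions `left` and `right`."""
    if n < 2:
        raise ValueError("n must be at least 2 to span both junction residues")
    return [(left - x, right + (n - 2) - x) for x in range(n - 1)]


windows = edge_windows(10)
print(len(windows))   # 9
print(windows[0])     # (57, 66)
print(windows[-1])    # (49, 58)
```

Each window has length n and contains both position 57 and position 58, so every enumerated sequence includes the LA junction.
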

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z21368_PEA_1_P5 is encoded by the following transcript(s):
Z21368_PEA_1_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T9 is shown in bold; this coding portion starts at
position 556 and ends at position 2928.


Variant protein Z21368_PEA_1_P15 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T23. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:



Comparison report between Z21368_PEA_1_P15 and SUL1_HUMAN:
1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P15, comprising a first
amino acid sequence being at least 90% homologous to

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFmAFVTTPMCCPSRSSMLTGKYVHNflNVYTNNENCSSPSW 
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCR 
NGIKEKHGFDYAKDYFTDLITN^

FSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHM 

SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP 
GSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRTNKKAKIWRDTFL 
VERG corresponding to amino acids 1 - 416 of SUL1_HUMAN, which also corresponds to
amino acids 1 - 416 of Z21368_PEA_1_P15.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z21368_PEA_1_P15 is encoded by the following transcript(s):
Z21368_PEA_1_T23, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T23 is shown in bold; this coding portion starts at
position 691 and ends at position 1938.

Variant protein Z21368_PEA_1_P16 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T24. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z21368_PEA_1_P16 and SUL1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P16, comprising a first
amino acid sequence being at least 90 % homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCR
NGIKEKHGFDYAKDYFTDLITNESINYFKMSKRM
FSKLYPNASQHITPSYNYAPNMDKHW
SVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEP
GSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNR corresponding to amino
acids 1 - 397 of SUL1_HUMAN, which also corresponds to amino acids 1 - 397 of
Z21368_PEA_1_P16, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence CVIVPPLSQPQIH corresponding to amino
acids 398 - 410 of Z21368_PEA_1_P16, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z21368_PEA_1_P16, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence CVIVPPLSQPQIH in Z21368_PEA_1_P16.
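
By way of non-limiting illustration only, the percentage thresholds recited for the tail may be pictured with the following sketch. It assumes a simple, ungapped, position-by-position identity between peptides of equal length, which is a simplification; homology is ordinarily determined with alignment software, and the function `percent_identity` and the candidate peptide shown are hypothetical.

```python
TAIL_P16 = "CVIVPPLSQPQIH"  # amino acids 398 - 410 of Z21368_PEA_1_P16


def percent_identity(candidate, reference):
    """Ungapped, position-by-position identity between equal-length peptides."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares peptides of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


# The tail spans positions 398..410 inclusive, so it must be 13 residues long.
assert len(TAIL_P16) == 410 - 398 + 1

# Hypothetical candidate carrying a single substitution (final H -> Y).
candidate = "CVIVPPLSQPQIY"
identity = percent_identity(candidate, TAIL_P16)
print(f"{identity:.1f}% identical; meets the 90% threshold: {identity >= 90.0}")
```

With 12 of 13 positions matching, the hypothetical candidate would satisfy the 70%, 80%, 85% and 90% thresholds but not the 95% threshold under this simplified measure.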

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z21368_PEA_1_P16 is encoded by the following transcript(s):
Z21368_PEA_1_T24, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T24 is shown in bold; this coding portion starts at
position 691 and ends at position 1920.

Variant protein Z21368_PEA_1_P22 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T10. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z21368_PEA_1_P22 and SUL1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P22, comprising a first
amino acid sequence being at least 90 % homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW
QAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCR
NGIKEKHGFDYAK corresponding to amino acids 1 - 188 of SUL1_HUMAN, which also
corresponds to amino acids 1 - 188 of Z21368_PEA_1_P22, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
ARYDGDQPRCAPRPRGLSPTVF corresponding to amino acids 189 - 210 of
Z21368_PEA_1_P22, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of Z21368_PEA_1_P22, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ARYDGDQPRCAPRPRGLSPTVF in Z21368_PEA_1_P22.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z21368_PEA_1_P22 is encoded by the following transcript(s):
Z21368_PEA_1_T10, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T10 is shown in bold; this coding portion starts at
position 691 and ends at position 1320.



Variant protein Z21368_PEA_1_P23 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z21368_PEA_1_T11. An alignment is given to the known protein (Extracellular sulfatase Sulf-1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:



Comparison report between Z21368_PEA_1_P23 and Q7Z2W2:

1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P23, comprising a first
amino acid sequence being at least 90 % homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW
QAMHEPRTFAVYLNNTGYRT corresponding to amino acids 1 - 137 of Q7Z2W2, which also
corresponds to amino acids 1 - 137 of Z21368_PEA_1_P23, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
GLLHRLNH corresponding to amino acids 138 - 145 of Z21368_PEA_1_P23, wherein said
first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z21368_PEA_1_P23, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GLLHRLNH in Z21368_PEA_1_P23.

Comparison report between Z21368_PEA_1_P23 and SUL1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z21368_PEA_1_P23, comprising a first
amino acid sequence being at least 90 % homologous to
MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSW
QAMHEPRTFAVYLNNTGYRT corresponding to amino acids 1 - 137 of SUL1_HUMAN,
which also corresponds to amino acids 1 - 137 of Z21368_PEA_1_P23, and a second amino
acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence GLLHRLNH corresponding to amino acids 138 - 145 of Z21368_PEA_1_P23,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z21368_PEA_1_P23, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GLLHRLNH in Z21368_PEA_1_P23.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z21368_PEA_1_P23 is encoded by the following transcript(s):
Z21368_PEA_1_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z21368_PEA_1_T11 is shown in bold; this coding portion starts at
position 691 and ends at position 1125.
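
By way of non-limiting illustration only, the coding portions quoted above for the four transcripts can be cross-checked against the lengths of the corresponding variant proteins (416, 410, 210 and 145 amino acids, taken from the last residue number in each comparison report). The sketch assumes that positions are 1-based and inclusive; the helper `codon_count` is hypothetical.

```python
# (transcript, coding start, coding end, variant protein, protein length in residues)
CODING_PORTIONS = [
    ("Z21368_PEA_1_T23", 691, 1938, "Z21368_PEA_1_P15", 416),
    ("Z21368_PEA_1_T24", 691, 1920, "Z21368_PEA_1_P16", 410),
    ("Z21368_PEA_1_T10", 691, 1320, "Z21368_PEA_1_P22", 210),
    ("Z21368_PEA_1_T11", 691, 1125, "Z21368_PEA_1_P23", 145),
]


def codon_count(start, end):
    """Number of whole codons in an inclusive, 1-based nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding portion is not a multiple of three nucleotides")
    return length // 3


for transcript, start, end, protein, residues in CODING_PORTIONS:
    codons = codon_count(start, end)
    # Report whether the codon count equals the stated protein length.
    print(transcript, protein, codons, codons == residues)
```

Under these assumptions each coding portion contains exactly as many codons as the variant protein has residues, which suggests that the quoted end positions do not include a stop codon.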

As noted above, cluster Z21368 features 34 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster Z21368_PEA_1_node_0 according to the present invention is supported
by 8 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T9. Table 88 below describes the
starting and ending position of this segment on each transcript.

Table 88 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T9      1                           327



Segment cluster Z21368_PEA_1_node_15 according to the present invention is supported
by 26 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 89 below describes the starting and ending position of this segment
on each transcript.

Table 89 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     631                         807
Z21368_PEA_1_T11     631                         807
Z21368_PEA_1_T23     631                         807
Z21368_PEA_1_T24     631                         807
Z21368_PEA_1_T5      469                         645
Z21368_PEA_1_T6      469                         645
Z21368_PEA_1_T9      496                         672
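
By way of non-limiting illustration only, the positions in Table 89 can be used to confirm that segment Z21368_PEA_1_node_15 has the same length on every transcript in which it appears, even though its coordinates differ between transcripts. The sketch assumes 1-based, inclusive positions; the helper `segment_lengths` is hypothetical.

```python
# Start and end positions of segment Z21368_PEA_1_node_15, from Table 89.
NODE_15 = {
    "Z21368_PEA_1_T10": (631, 807),
    "Z21368_PEA_1_T11": (631, 807),
    "Z21368_PEA_1_T23": (631, 807),
    "Z21368_PEA_1_T24": (631, 807),
    "Z21368_PEA_1_T5": (469, 645),
    "Z21368_PEA_1_T6": (469, 645),
    "Z21368_PEA_1_T9": (496, 672),
}


def segment_lengths(locations):
    """Map each transcript name to the inclusive length of the segment on it."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


lengths = segment_lengths(NODE_15)
for name, length in lengths.items():
    print(name, length)

# A single distinct value means the segment length agrees across transcripts.
print("distinct lengths:", sorted(set(lengths.values())))
```

The same check may be applied to any of the segment location tables in this section; a table whose rows yield more than one distinct length would merit a second look at the quoted positions.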



Segment cluster Z21368_PEA_1_node_19 according to the present invention is supported
by 24 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5 and Z21368_PEA_1_T6.
Table 90 below describes the starting and ending position of this segment on each transcript.

Table 90 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     863                         1102
Z21368_PEA_1_T11     863                         1102
Z21368_PEA_1_T23     863                         1102
Z21368_PEA_1_T24     863                         1102
Z21368_PEA_1_T5      701                         940
Z21368_PEA_1_T6      701                         940






Segment cluster Z21368_PEA_1_node_2 according to the present invention is supported
by 15 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5 and Z21368_PEA_1_T6.
Table 91 below describes the starting and ending position of this segment on each transcript.

Table 91 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10                                 300
Z21368_PEA_1_T11                                 300
Z21368_PEA_1_T23                                 300
Z21368_PEA_1_T24                                 300
Z21368_PEA_1_T5                                  300
Z21368_PEA_1_T6                                  300



Segment cluster Z21368_PEA_1_node_21 according to the present invention is supported
by 37 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T23,
Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table
92 below describes the starting and ending position of this segment on each transcript.

Table 92 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1103                        1254
Z21368_PEA_1_T23     1103                        1254
Z21368_PEA_1_T24     1103                        1254
Z21368_PEA_1_T5      941                         1092
Z21368_PEA_1_T6      941                         1092
Z21368_PEA_1_T9      728                         879



Segment cluster Z21368_PEA_1_node_33 according to the present invention is supported
by 45 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 93 below describes the starting and ending position of this segment
on each transcript.

Table 93 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1502                        1677
Z21368_PEA_1_T11     1424                        1599
Z21368_PEA_1_T23     1576                        1751
Z21368_PEA_1_T24     1576                        1751
Z21368_PEA_1_T5      1414                        1589
Z21368_PEA_1_T6      1414                        1589
Z21368_PEA_1_T9      1201                        1376



Segment cluster Z21368_PEA_1_node_36 according to the present invention is supported
by 44 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 94 below describes the starting and ending position of this segment
on each transcript.

Table 94 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1678                        1806
Z21368_PEA_1_T11     1600                        1728
Z21368_PEA_1_T23     1752                        1880
Z21368_PEA_1_T24     1752                        1880
Z21368_PEA_1_T5      1590                        1718
Z21368_PEA_1_T6      1590                        1718
Z21368_PEA_1_T9      1377                        1505



Segment cluster Z21368_PEA_1_node_37 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T24. Table 95 below describes the
starting and ending position of this segment on each transcript.

Table 95 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T24     1881                        2159



Segment cluster Z21368_PEA_1_node_39 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T23 and Z21368_PEA_1_T24.
Table 96 below describes the starting and ending position of this segment on each transcript.

Table 96 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T23     1938                        2790
Z21368_PEA_1_T24     2217                        3069



Segment cluster Z21368_PEA_1_node_4 according to the present invention is supported
by 13 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23 and Z21368_PEA_1_T24. Table 97 below describes the starting and
ending position of this segment on each transcript.

Table 97 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     301                         462
Z21368_PEA_1_T11     301                         462
Z21368_PEA_1_T23     301                         462
Z21368_PEA_1_T24     301                         462

Segment cluster Z21368_PEA_1_node_41 according to the present invention is supported
by 49 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 98 below describes
the starting and ending position of this segment on each transcript.

Table 98 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1864                        1993
Z21368_PEA_1_T11     1786                        1915
Z21368_PEA_1_T5      1776                        1905
Z21368_PEA_1_T6      1776                        1905
Z21368_PEA_1_T9      1563                        1692



Segment cluster Z21368_PEA_1_node_43 according to the present invention is supported
by 52 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 99 below describes
the starting and ending position of this segment on each transcript.

Table 99 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1994                        2210
Z21368_PEA_1_T11     1916                        2132
Z21368_PEA_1_T5      1906                        2122
Z21368_PEA_1_T6      1906                        2122
Z21368_PEA_1_T9      1693                        1909




Segment cluster Z21368_PEA_1_node_45 according to the present invention is supported
by 64 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 100
below describes the starting and ending position of this segment on each transcript.



Table 100 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     2211                        2466
Z21368_PEA_1_T11     2133                        2388
Z21368_PEA_1_T5      2123                        2378
Z21368_PEA_1_T6      2123                        2378
Z21368_PEA_1_T9      1910                        2165



Segment cluster Z21368_PEA_1_node_53 according to the present invention is supported
by 60 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 101 below describes
the starting and ending position of this segment on each transcript.



Table 101 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     2725                        2900
Z21368_PEA_1_T11     2647                        2822
Z21368_PEA_1_T5      2637                        2812
Z21368_PEA_1_T6      2637                        2812
Z21368_PEA_1_T9      2424                        2599



Segment cluster Z21368_PEA_1_node_56 according to the present invention is supported
by 50 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11 and
Z21368_PEA_1_T9. Table 102 below describes the starting and ending position of this segment
on each transcript.

Table 102 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     2901                        3043
Z21368_PEA_1_T11     2823                        2965
Z21368_PEA_1_T9      2600                        2742



Segment cluster Z21368_PEA_1_node_58 according to the present invention is supported
by 71 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 103 below describes
the starting and ending position of this segment on each transcript.

Table 103 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     3044                        3167
Z21368_PEA_1_T11     2966                        3089
Z21368_PEA_1_T5      2813                        2936
Z21368_PEA_1_T6      2813                        2936
Z21368_PEA_1_T9      2743                        2866



Segment cluster Z21368_PEA_1_node_66 according to the present invention is supported
by 142 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 104 below describes
the starting and ending position of this segment on each transcript.

Table 104 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     3202                        3789
Z21368_PEA_1_T11     3124                        3711
Z21368_PEA_1_T5      2971                        3558
Z21368_PEA_1_T6      2971                        3558
Z21368_PEA_1_T9      2901                        3488



Segment cluster Z21368_PEA_1_node_67 according to the present invention is supported
by 181 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 105 below describes
the starting and ending position of this segment on each transcript.



Table 105 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     3790                        4374
Z21368_PEA_1_T11     3712                        4296
Z21368_PEA_1_T5      3559                        4143
Z21368_PEA_1_T6      3559                        4143
Z21368_PEA_1_T9      3489                        4073



Segment cluster Z21368_PEA_1_node_69 according to the present invention is supported
by 150 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 106 below describes
the starting and ending position of this segment on each transcript.

Table 106 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     4428                        4755
Z21368_PEA_1_T11     4350                        4677
Z21368_PEA_1_T5      4197                        5384
Z21368_PEA_1_T6      4197                        4524
Z21368_PEA_1_T9      4127                        4454



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
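
By way of non-limiting illustration only, the distinction between short segments and the remaining segments may be sketched as a length test. The positions shown are those given for transcript Z21368_PEA_1_T10 in the segment location tables of this section; the cut-off constant and the helper `is_short` are hypothetical, since the text states only that short segments are "up to about 120 bp" in length.

```python
SHORT_SEGMENT_MAX_BP = 120  # "up to about 120 bp"; the exact cut-off is an assumption

# (segment, start, end) on transcript Z21368_PEA_1_T10, 1-based inclusive positions
SEGMENTS_ON_T10 = [
    ("Z21368_PEA_1_node_11", 558, 602),
    ("Z21368_PEA_1_node_12", 603, 630),
    ("Z21368_PEA_1_node_15", 631, 807),
    ("Z21368_PEA_1_node_58", 3044, 3167),
]


def is_short(start, end, limit=SHORT_SEGMENT_MAX_BP):
    """True when the inclusive segment length does not exceed the limit."""
    return (end - start + 1) <= limit


for name, start, end in SEGMENTS_ON_T10:
    length = end - start + 1
    label = "short" if is_short(start, end) else "not short"
    print(name, length, label)
```

Under this assumed cut-off, node_11 (45 bp) and node_12 (28 bp) fall among the short segments, whereas node_15 (177 bp) and node_58 (124 bp) do not, which agrees with the grouping used in this description.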



Segment cluster Z21368_PEA_1_node_11 according to the present invention is supported
by 26 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 107 below describes the starting and ending position of this segment
on each transcript.

Table 107 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     558                         602
Z21368_PEA_1_T11     558                         602
Z21368_PEA_1_T23     558                         602
Z21368_PEA_1_T24     558                         602
Z21368_PEA_1_T5      396                         440
Z21368_PEA_1_T6      396                         440
Z21368_PEA_1_T9      423                         467



Segment cluster Z21368_PEA_1_node_12 according to the present invention is supported
by 23 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 108 below describes the starting and ending position of this segment
on each transcript.

Table 108 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     603                         630
Z21368_PEA_1_T11     603                         630
Z21368_PEA_1_T23     603                         630
Z21368_PEA_1_T24     603                         630
Z21368_PEA_1_T5      441                         468
Z21368_PEA_1_T6      441                         468
Z21368_PEA_1_T9      468                         495



Segment cluster Z21368_PEA_1_node_16 according to the present invention can be
found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 109 below describes the starting and ending position of this segment
on each transcript.

Table 109 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     808                         822
Z21368_PEA_1_T11     808                         822
Z21368_PEA_1_T23     808                         822
Z21368_PEA_1_T24     808                         822
Z21368_PEA_1_T5      646                         660
Z21368_PEA_1_T6      646                         660
Z21368_PEA_1_T9      673                         687



Segment cluster Z21368_PEA_1_node_17 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 110 below describes the starting and ending position of this segment
on each transcript.

Table 110 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     823                         862
Z21368_PEA_1_T11     823                         862
Z21368_PEA_1_T23     823                         862
Z21368_PEA_1_T24     823                         862
Z21368_PEA_1_T5      661                         700
Z21368_PEA_1_T6      661                         700
Z21368_PEA_1_T9      688                         727



Segment cluster Z21368_PEA_1_node_23 according to the present invention is supported
by 36 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T11, Z21368_PEA_1_T23,
Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table
111 below describes the starting and ending position of this segment on each transcript.

Table 111 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T11     1103                        1176
Z21368_PEA_1_T23     1255                        1328
Z21368_PEA_1_T24     1255                        1328
Z21368_PEA_1_T5      1093                        1166
Z21368_PEA_1_T6      1093                        1166
Z21368_PEA_1_T9      880                         953



Segment cluster Z21368_PEA_1_node_24 according to the present invention is supported
by 36 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 112 below describes the starting and ending position of this segment
on each transcript.

Table 112 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1255                        1350
Z21368_PEA_1_T11     1177                        1272
Z21368_PEA_1_T23     1329                        1424
Z21368_PEA_1_T24     1329                        1424
Z21368_PEA_1_T5      1167                        1262
Z21368_PEA_1_T6      1167                        1262
Z21368_PEA_1_T9      954                         1049



Segment cluster Z21368_PEA_1_node_30 according to the present invention is supported
by 39 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 113 below describes the starting and ending position of this segment
on each transcript.

Table 113 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1351                        1409
Z21368_PEA_1_T11     1273                        1331
Z21368_PEA_1_T23     1425                        1483
Z21368_PEA_1_T24     1425                        1483
Z21368_PEA_1_T5      1263                        1321
Z21368_PEA_1_T6      1263                        1321
Z21368_PEA_1_T9      1050                        1108



Segment cluster Z21368_PEA_1_node_31 according to the present invention is supported
by 40 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 114 below describes the starting and ending position of this segment
on each transcript.

Table 114 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1410                        1501
Z21368_PEA_1_T11     1332                        1423
Z21368_PEA_1_T23     1484                        1575
Z21368_PEA_1_T24     1484                        1575
Z21368_PEA_1_T5      1322                        1413
Z21368_PEA_1_T6      1322                        1413
Z21368_PEA_1_T9      1109                        1200

Segment cluster Z21368_PEA_1_node_38 according to the present invention is supported
by 45 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 115 below describes the starting and ending position of this segment
on each transcript.

Table 115 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
Z21368_PEA_1_T10     1807                        1863
Z21368_PEA_1_T11     1729                        1785
Z21368_PEA_1_T23     1881                        1937
Z21368_PEA_1_T24     2160                        2216
Z21368_PEA_1_T5      1719                        1775
Z21368_PEA_1_T6      1719                        1775
Z21368_PEA_1_T9      1506                        1562



Segment cluster Z21368_PEA_1_node_47 according to the present invention is supported
by 61 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 116 below describes
the starting and ending position of this segment on each transcript.

Table 116 - Segment location on transcripts 



Transcript name 


Segment starting position 


i Segment ending position ; 


Z21368_PEA_1_T10 


2467 


2563 


Z21368_PEA_1_T11 


2389 


2485 


Z21368_PEA_1_T5 


2379 


2475 


Z21368_PEA_1_T6 


2379 


2475 
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Z21368_PEA_1_T9 


2166 


2262 









Segment cluster Z21368_PEA_1_node_49 according to the present invention is supported
by 57 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 117 below describes
the starting and ending position of this segment on each transcript.

Table 117 - Segment location on transcripts

Transcript name       Segment starting position   Segment ending position
Z21368_PEA_1_T10      2564                        2658
Z21368_PEA_1_T11      2486                        2580
Z21368_PEA_1_T5       2476                        2570
Z21368_PEA_1_T6       2476                        2570
Z21368_PEA_1_T9       2263                        2357



Segment cluster Z21368_PEA_1_node_51 according to the present invention is supported
by 46 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 118 below describes
the starting and ending position of this segment on each transcript.

Table 118 - Segment location on transcripts

Transcript name       Segment starting position   Segment ending position
Z21368_PEA_1_T10      2659                        2724
Z21368_PEA_1_T11      2581                        2646
Z21368_PEA_1_T5       2571                        2636
Z21368_PEA_1_T6       2571                        2636
Z21368_PEA_1_T9       2358                        2423
Segment cluster Z21368_PEA_1_node_61 according to the present invention is supported
by 61 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 119 below describes
the starting and ending position of this segment on each transcript.

Table 119 - Segment location on transcripts

Transcript name       Segment starting position   Segment ending position
Z21368_PEA_1_T10      3168                        3201
Z21368_PEA_1_T11      3090                        3123
Z21368_PEA_1_T5       2937                        2970
Z21368_PEA_1_T6       2937                        2970
Z21368_PEA_1_T9       2867                        2900



Segment cluster Z21368_PEA_1_node_68 according to the present invention is supported
by 87 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T5, Z21368_PEA_1_T6 and Z21368_PEA_1_T9. Table 120 below describes
the starting and ending position of this segment on each transcript.

Table 120 - Segment location on transcripts

Transcript name       Segment starting position   Segment ending position
Z21368_PEA_1_T10      4375                        4427
Z21368_PEA_1_T11      4297                        4349
Z21368_PEA_1_T5       4144                        4196
Z21368_PEA_1_T6       4144                        4196
Z21368_PEA_1_T9       4074                        4126




Segment cluster Z21368_PEA_1_node_7 according to the present invention is supported
by 29 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z21368_PEA_1_T10, Z21368_PEA_1_T11,
Z21368_PEA_1_T23, Z21368_PEA_1_T24, Z21368_PEA_1_T5, Z21368_PEA_1_T6 and
Z21368_PEA_1_T9. Table 121 below describes the starting and ending position of this segment
on each transcript.

Table 121 - Segment location on transcripts

Transcript name       Segment starting position   Segment ending position
Z21368_PEA_1_T10      463                         557
Z21368_PEA_1_T11      463                         557
Z21368_PEA_1_T23      463                         557
Z21368_PEA_1_T24      463                         557
Z21368_PEA_1_T5       301                         395
Z21368_PEA_1_T6       301                         395
Z21368_PEA_1_T9       328                         422



Overexpression of at least a portion of this cluster was determined according to
oligonucleotides and one or more chips. The results were as follows: Oligonucleotide
Z21368_0_0_61857 was on the TAA chip and was found to be overexpressed in Lung cancer
(general), in Lung adenocarcinoma, and in Lung squamous cell cancer.

Variant protein alignment to the previously known protein:

Sequence name: /tmp/5ER3vIMKE2/9L0Y71D1TQ:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P2 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 7664.00
Escore: 0
Matching length: 761 Total length: 761
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150

151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200

201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250

251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300

301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350

351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400

401 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 450

451 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 500

501 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 550

551 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 600

601 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 650

651 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 700
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 700

701 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 750
    ||||||||||||||||||||||||||||||||||||||||||||||||||
701 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 750

751 NNHWQTAPFWN 761
    |||||||||||
751 NNHWQTAPFWN 761



Sequence name: /tmp/tt3yfXIUKV/YxSTFWr66h:Q7Z2W2
Sequence documentation:

Alignment of: Z21368_PEA_1_P5 x Q7Z2W2

Alignment segment 1/1:

Quality: 7869.00
Escore: 0
Matching length: 791 Total length: 871
Matching Percent Similarity: 99.87 Matching Percent Identity: 99.87
Total Percent Similarity: 90.70 Total Percent Identity: 90.70
Gaps: 1

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELA 58
    |||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

 59                                       FFGKYLNEYNGS 70
                                          ||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTVFFGKYLNEYNGS 150

 71 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 120
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200

121 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 170
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250

171 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 220
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300

221 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 270
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350

271 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 320
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400

321 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 370
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 450

371 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 420
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 500

421 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 470
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 550

471 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 520
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 600

521 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 570
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 650

571 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 620
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 700

621 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 670
    ||||||||||||||||||||||||||||||||||||||||||||||||||
701 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 750

671 NNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHNFLFCEFATGFLEY 720
    ||||||||||||||||||||||||||||||||||||||||||||||||||
751 NNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHNFLFCEFATGFLEY 800

721 FDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDV 770
    ||||||||||||||||||||||||||||||||||||||||||||||||||
801 FDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDV 850

771 GNKDGGSYDLHRGQLWDGWEG 791
    |||||||||||||||||||||
851 GNKDGGSYDLHRGQLWDGWEG 871
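
The percentages reported for this alignment can be related to the residue counts with a short calculation. The following is a minimal sketch only; it assumes that "Matching Percent Identity" is the number of identical residues divided by the matching length (791) and that "Total Percent Identity" is the same number divided by the total length (871). This interpretation is inferred from the reported values and is not stated in the specification. The gap character "-" and the function names are likewise illustrative choices.

```python
def count_identities(query, subject, gap="-"):
    """Count aligned columns holding the same residue (gap columns never count)."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != gap)


def percent(count, length):
    """Percentage rounded to two decimals, as in the alignment header."""
    return round(100.0 * count / length, 2)


# Residues 51-150 of Q7Z2W2, copied from the alignment above.
SUBJECT_51_150 = (
    "DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV"
    "HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTVFFGKYLNEYNGS"
)
# Residues 51-70 of the variant, padded with gap characters (assumed layout).
QUERY_51_70 = "DDQDVELA" + "-" * 80 + "FFGKYLNEYNGS"

region_identities = count_identities(QUERY_51_70, SUBJECT_51_150)
# Residues 1-50 and 71-791 of the variant are identical to the known protein.
identical = 50 + region_identities + (791 - 70)

matching_percent_identity = percent(identical, 791)
total_percent_identity = percent(identical, 871)
print(identical, matching_percent_identity, total_percent_identity)
```

Under these assumptions the single mismatch (A versus G at variant position 58) accounts for the difference between 790 identical residues and the matching length of 791.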

Sequence name: /tmp/tt3yfXIUKV/YxSTFWr66h:AAH12997
Sequence documentation:

Alignment of: Z21368_PEA_1_P5 x AAH12997

Alignment segment 1/1:

Quality: 420.00
Escore: 0
Matching length: 40 Total length: 40
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

752 LRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG 791
    ||||||||||||||||||||||||||||||||||||||||
  1 LRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG 40



Sequence name: /tmp/tt3yfXIUKV/YxSTFWr66h:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P5 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 7878.00
Escore: 0
Matching length: 791 Total length: 871
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 90.82 Total Percent Identity: 90.82
Gaps: 1

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVEL 57
    |||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

 58                                      AFFGKYLNEYNGS 70
                                         |||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150

 71 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 120
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200

121 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 170
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250

171 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 220
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300

221 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 270
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350

271 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 320
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400

321 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 370
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 NKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARY 450

371 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 420
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 QTACEQPGQKWQCIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDK 500

421 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 470
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 DKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEF 550

471 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 520
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 EGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA 600

521 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 570
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEI 650

571 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 620
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 EALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLH 700

621 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 670
    ||||||||||||||||||||||||||||||||||||||||||||||||||
701 PFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHD 750

671 NNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHNFLFCEFATGFLEY 720
    ||||||||||||||||||||||||||||||||||||||||||||||||||
751 NNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHNFLFCEFATGFLEY 800

721 FDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDV 770
    ||||||||||||||||||||||||||||||||||||||||||||||||||
801 FDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDV 850

771 GNKDGGSYDLHRGQLWDGWEG 791
    |||||||||||||||||||||
851 GNKDGGSYDLHRGQLWDGWEG 871



Sequence name: /tmp/AVAZGWHuFO/RzHFOnHIsT:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P15 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 4174.00
Escore: 0
Matching length: 416 Total length: 416
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150

151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200

201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250

251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300

301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350

351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRT 400

401 NKKAKIWRDTFLVERG 416
    ||||||||||||||||
401 NKKAKIWRDTFLVERG 416



Sequence name: /tmp/JhwgRdKqmt/kqSmjxkWWk:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P16 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 3985.00
Escore: 0
Matching length: 397 Total length: 397
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150

151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESI 200

201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 NYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYN 250

251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 YAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 300

301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVE 350

351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNR 397
    |||||||||||||||||||||||||||||||||||||||||||||||
351 PGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNR 397

Sequence name: /tmp/GPlnIw3BOg/zXFdxqG4ow:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P22 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 1897.00
Escore: 0
Matching length: 188 Total length: 188
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGS 150

151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAK 188
    ||||||||||||||||||||||||||||||||||||||
151 YIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAK 188



Sequence name: /tmp/oji5Fs74fB/8xeB9KrGjp:Q7Z2W2
Sequence documentation:

Alignment of: Z21368_PEA_1_P23 x Q7Z2W2

Alignment segment 1/1:

Quality: 1368.00
Escore: 0.000511
Matching length: 137 Total length: 137
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRT 137
    |||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRT 137
Sequence name: /tmp/oji5Fs74fB/8xeB9KrGjp:SUL1_HUMAN
Sequence documentation:

Alignment of: Z21368_PEA_1_P23 x SUL1_HUMAN

Alignment segment 1/1:

Quality: 1368.00
Escore: 0.000511
Matching length: 137 Total length: 137
Matching Percent Similarity: 100.00 Matching Percent Identity: 100.00
Total Percent Similarity: 100.00 Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLT 50

 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 DDQDVELGSLQVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYV 100

101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRT 137
    |||||||||||||||||||||||||||||||||||||
101 HNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRT 137

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 Z21368 transcripts which are
detectable by amplicon as depicted in sequence name Z21368junc17-21 in normal and cancerous
lung tissues

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts detectable by
or according to junc17-21 segment, Z21368junc17-21 amplicon (SEQ ID NO: 1642) and
Z21368junc17-21F (SEQ ID NO: 1640) and Z21368junc17-21R (SEQ ID NO: 1641) primers was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO:331) - was measured similarly. For each RT sample, the expression of the above amplicon
was normalized to the geometric mean of the quantities of the housekeeping genes. The
normalized quantity of each RT sample was then divided by the median of the quantities of the
normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples
in testing panel", above), to obtain a value of fold up-regulation for each sample relative to
the median of the normal PM samples.
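
The normalization described above can be sketched as follows. This is an illustrative example only: the sample names and all quantities are hypothetical and are not taken from the experiments, and the function names are arbitrary. It assumes Python 3.8 or later for `statistics.geometric_mean`.

```python
from statistics import geometric_mean, median


def normalize_to_housekeeping(amplicon_qty, housekeeping_qtys):
    """Divide the amplicon quantity by the geometric mean of the housekeeping quantities."""
    return amplicon_qty / geometric_mean(housekeeping_qtys)


def fold_up_regulation(normalized_by_sample, normal_sample_ids):
    """Divide every normalized quantity by the median of the normal samples."""
    baseline = median(normalized_by_sample[s] for s in normal_sample_ids)
    return {s: q / baseline for s, q in normalized_by_sample.items()}


# Hypothetical data: sample -> (amplicon quantity, four housekeeping-gene quantities).
raw = {
    "normal_1": (2.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_2": (4.0, [2.0, 2.0, 2.0, 2.0]),
    "normal_3": (9.0, [3.0, 3.0, 3.0, 3.0]),
    "tumor_1": (24.0, [1.0, 4.0, 2.0, 2.0]),
}

normalized = {s: normalize_to_housekeeping(q, hk) for s, (q, hk) in raw.items()}
fold = fold_up_regulation(normalized, ["normal_1", "normal_2", "normal_3"])

# Samples at or above the 5 fold threshold used in the text.
over_expressed = [s for s, f in fold.items() if f >= 5.0]
print(fold, over_expressed)
```

The geometric mean is used rather than the arithmetic mean so that a single highly expressed housekeeping gene does not dominate the normalization factor.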

Figure 14 is a histogram showing over-expression of the above-indicated
SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts in cancerous lung samples relative
to the normal samples. Values represent the average of duplicate experiments. Error bars
indicate the minimal and maximal values obtained. As is evident from Figure 14, the expression
of SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts detectable by the above
amplicon in cancer samples was significantly higher than in the non-cancerous samples (Sample
Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel"). Notably, an
over-expression of at least 5 fold was found in 10 out of 15 adenocarcinoma samples, 7 out of 16
squamous cell carcinoma samples, 0 out of 4 large cell carcinoma samples and in 0 out of 8
small cell carcinoma samples.
A threshold of 5 fold over-expression was found to differentiate between cancer and normal
samples with a P value of 3.56E-04 in adenocarcinoma and 9.66E-03 in squamous cell carcinoma,
as checked by Fisher's exact test. The above values demonstrate statistical significance of the results.
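
A minimal sketch of such a test is given below. It rests on assumptions that are not stated in the text: that the test was one-sided, that the 12 normal samples (Sample Nos. 47-50, 90-93, 96-99) formed the comparison group, and that none of them reached the 5 fold threshold. The function name is arbitrary. Under these assumptions the computed values are close to the P values given above.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a count of at
    least `a` in the top-left cell, with all margins held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(total, col1)


# Rows: cancer, normal. Columns: at or above threshold, below threshold.
# The normal row (0 of 12 above threshold) is an assumption.
p_adenocarcinoma = fisher_exact_greater(10, 5, 0, 12)
p_squamous = fisher_exact_greater(7, 9, 0, 12)
print(f"{p_adenocarcinoma:.2E} {p_squamous:.2E}")
```

The standard library `math.comb` is used here so that the sketch has no external dependencies; a statistics package would normally be used in practice.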

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z21368junc17-21F forward primer;
and Z21368junc17-21R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon:
Z21368junc17-21.

Forward primer (SEQ ID NO: 1640): GGACGGATACAGCAGGAACG
Reverse primer (SEQ ID NO: 1641): TATTTTCCAAAAAAGGCCAGCTC
Amplicon (SEQ ID NO: 1642):
GGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTAC
CGATGATCAAGATGTGGAGCTGGCCTTTTTTGGAAAATA
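
The relationship between the primer pair and the amplicon can be checked with a short script: the amplicon is expected to begin with the forward primer and to end with the reverse complement of the reverse primer. The sketch below uses the three sequences listed above; the helper names are illustrative only.

```python
FORWARD_PRIMER = "GGACGGATACAGCAGGAACG"      # SEQ ID NO: 1640
REVERSE_PRIMER = "TATTTTCCAAAAAAGGCCAGCTC"   # SEQ ID NO: 1641
AMPLICON = (                                 # SEQ ID NO: 1642
    "GGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTAC"
    "CGATGATCAAGATGTGGAGCTGGCCTTTTTTGGAAAATA"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with `forward` and ends with revcomp(`reverse`)."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


flanked = primers_flank(AMPLICON, FORWARD_PRIMER, REVERSE_PRIMER)
print(flanked, len(AMPLICON))
```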



Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 Z21368 transcripts, which are
detectable by amplicon as depicted in sequence name Z21368junc17-21 in different normal
tissues

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts detectable by
or according to Z21368junc17-21 amplicon (SEQ ID NO: 1642) and Z21368junc17-21F (SEQ
ID NO: 1640) and Z21368junc17-21R (SEQ ID NO: 1641) primers was measured by real time PCR. In
parallel the expression of four housekeeping genes - RPL19 (GenBank Accession No.
NM_000981; RPL19 amplicon, SEQ ID NO:1630), TATA box (GenBank Accession No.
NM_003194; TATA amplicon, SEQ ID NO:1633), Ubiquitin (GenBank Accession No.
BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession
No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331) - was measured similarly. For
each RT sample, the expression of the above amplicon was normalized to the geometric mean of
the quantities of the housekeeping genes. The normalized quantity of each RT sample was then
divided by the median of the quantities of the breast samples (Sample Nos. 33-35, Table 3,
"Tissue samples in normal panel", above), to obtain a value of relative expression of each
sample relative to the median of the breast samples.

Forward primer (SEQ ID NO: 1640): GGACGGATACAGCAGGAACG
Reverse primer (SEQ ID NO: 1641): TATTTTCCAAAAAAGGCCAGCTC
Amplicon (SEQ ID NO: 1642):
GGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTAC
CGATGATCAAGATGTGGAGCTGGCCTTTTTTGGAAAATA

The results are shown in Figure 15, demonstrating the expression of Extracellular sulfatase
Sulf-1 Z21368 transcripts, which are detectable by amplicon as depicted in sequence name
Z21368junc17-21, in different normal tissues.



Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 Z21368 transcripts which are
detectable by amplicon as depicted in sequence name Z21368seg39 in normal and cancerous
lung tissues

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts detectable by
or according to seg39, Z21368seg39 amplicon (SEQ ID NO: 1645) and primers Z21368seg39F
(SEQ ID NO: 1643) and Z21368seg39R (SEQ ID NO: 1644) was measured by real time PCR.
In parallel the expression of four housekeeping genes - PBGD (GenBank Accession No.
BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1 (GenBank Accession No.
NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297), Ubiquitin (GenBank Accession
No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank
Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331) - was measured
similarly. For each RT sample, the expression of the above amplicon was normalized to the
geometric mean of the quantities of the housekeeping genes. The normalized quantity of each
RT sample was then divided by the median of the quantities of the normal post-mortem (PM)
samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel"), to obtain
a value of fold up-regulation for each sample relative to the median of the normal PM samples.
Figure 16 is a histogram showing over-expression of the above-indicated
SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts in cancerous lung samples relative to
the normal samples. Values represent the average of duplicate experiments. Error bars indicate
the minimal and maximal values obtained.

As is evident from Figure 16, the expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1
transcripts detectable by the above amplicon in cancer samples was higher than in the
non-cancerous samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel"). Notably, an over-expression of at least 5 fold was found in 8 out of 15 adenocarcinoma
samples, 5 out of 16 squamous cell carcinoma samples and 1 out of 4 large cell carcinoma
samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of SUL1_HUMAN - Extracellular
sulfatase Sulf-1 transcripts detectable by the above amplicon in lung cancer samples versus the
normal tissue samples was determined by T test as 2.17E-04 in adenocarcinoma, 9.94E-03 in
squamous cell carcinoma and 2.17E-01 in large cell carcinoma.

A threshold of 5 fold over-expression was found to differentiate between cancer and normal
samples with a P value of 1.74E-02 in adenocarcinoma, 1.58E-01 in squamous cell carcinoma and
4.33E-01 in large cell carcinoma, as checked by Fisher's exact test. The above values demonstrate
statistical significance of the results.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z21368seg39F forward primer; and
Z21368seg39R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Z21368seg39.
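
The threshold analysis reported above for the Z21368seg39 amplicon relies on Fisher's exact test applied to a 2x2 table (cancer versus normal, at or above the threshold versus below it). The sketch below shows one common two-sided formulation using only the standard library. The count of 8 out of 15 adenocarcinoma samples is taken from the text; the split of the 12 normal samples (0 above the threshold) is an assumption made for illustration, so the printed value is not expected to reproduce the P values reported in the application.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(col1, row1)
    # Unnormalized hypergeometric weight of every table sharing the observed margins.
    weights = {k: comb(row1, k) * comb(row2, col1 - k)
               for k in range(lowest, highest + 1)}
    observed = weights[a]
    # Sum the tables that are no more probable than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: cancer / normal. Columns: at least 5 fold / below 5 fold.
# 8 of 15 is from the text; 0 of 12 normal samples is an assumed split.
p_value = fisher_exact_two_sided(8, 7, 0, 12)
print(f"p = {p_value:.3g}")
```

Integer weights are compared instead of floating-point probabilities, which avoids rounding issues when deciding which tables are at least as extreme as the observed one.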

Primers:

Forward primer Z21368seg39F (SEQ ID NO: 1643): GTTGCATTTCTCAGTGCTGGTTT
Reverse primer Z21368seg39R (SEQ ID NO: 1644): AGGGTGCCGGGTGAGG
Amplicon Z21368seg39 (SEQ ID NO: 1645):

GTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATC
CTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCT
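
As a consistency check on the sequences listed above, an amplicon is expected to begin with the forward primer and end with the reverse complement of the reverse primer. The following sketch, with hypothetical helper names, verifies this for the Z21368seg39 primer pair and amplicon exactly as printed in the text.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with the forward primer and ends with the
    reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


FORWARD = "GTTGCATTTCTCAGTGCTGGTTT"   # Z21368seg39F, SEQ ID NO: 1643
REVERSE = "AGGGTGCCGGGTGAGG"          # Z21368seg39R, SEQ ID NO: 1644
AMPLICON = (                          # Z21368seg39, SEQ ID NO: 1645
    "GTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATC"
    "CTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCT"
)

print(primers_flank(AMPLICON, FORWARD, REVERSE))
```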

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 Z21368 transcripts which are
detectable by amplicon as depicted in sequence name Z21368seg39 in different normal tissues

Expression of SUL1_HUMAN - Extracellular sulfatase Sulf-1 transcripts detectable by or
according to Z21368seg39 amplicon (SEQ ID NO: 1645) and primers Z21368seg39F (SEQ ID NO:
1643) and Z21368seg39R (SEQ ID NO: 1644) was measured by real time PCR. In parallel the
expression of four housekeeping genes - RPL19 (GenBank Accession No. NM_000981;
RPL19 amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No. NM_003194; TATA
amplicon, SEQ ID NO: 1633), UBC (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO: 328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO: 331) - was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the breast samples (Sample Nos. 33-35, Table 3, above), to obtain a
value of relative expression of each sample relative to the median of the breast samples.

Forward primer Z21368seg39F (SEQ ID NO: 1643): GTTGCATTTCTCAGTGCTGGTTT
Reverse primer Z21368seg39R (SEQ ID NO: 1644): AGGGTGCCGGGTGAGG
Amplicon Z21368seg39 (SEQ ID NO: 1645):

GTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATC
CTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCT

The results are demonstrated in Figure 17, showing expression of SUL1_HUMAN -
Extracellular sulfatase Sulf-1 Z21368 transcripts, which are detectable by amplicon as depicted
in sequence name Z21368seg39, in different normal tissues.




DESCRIPTION FOR CLUSTER HUMGRP5E 
Cluster HUMGRP5E features 2 transcript(s) and 5 segment(s) of interest, the names for
which are given in Tables 160 and 161, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 162.

Table 160 - Transcripts of interest

Transcript Name    Sequence ID No.
HUMGRP5E_T4        20
HUMGRP5E_T5        21

Table 161 - Segments of interest

Segment Name       Sequence ID No.
HUMGRP5E_node_0    335
HUMGRP5E_node_2    336
HUMGRP5E_node_8    337
HUMGRP5E_node_3    338
HUMGRP5E_node_7    339

Table 162 - Proteins of interest

Protein Name       Sequence ID No.
HUMGRP5E_P4        1299
HUMGRP5E_P5        1300



These sequences are variants of the known protein Gastrin-releasing peptide precursor
(SwissProt accession identifier GRP_HUMAN; known also according to the synonyms GRP;
GRP-10), SEQ ID NO: 1421, referred to herein as the previously known protein.

Gastrin-releasing peptide is known or believed to have the following function(s):
stimulates gastrin release as well as other gastrointestinal hormones. The sequence for protein
Gastrin-releasing peptide precursor is given at the end of the application, as "Gastrin-releasing
peptide precursor amino acid sequence". Known polymorphisms for this sequence are as shown
in Table 163.



Table 163 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
4                                         S -> R

Protein Gastrin-releasing peptide localization is believed to be Secreted.

The previously known protein also has the following indication(s) and/or potential
therapeutic use(s): Diabetes, Type II. It has been investigated for clinical/therapeutic use in
humans, for example as a target for an antibody or small molecule, and/or as a direct
therapeutic; available information related to these investigations is as follows. Potential
pharmaceutically related or therapeutically related activity or activities of the previously known
protein are as follows: Bombesin antagonist; Insulinotropin agonist. A therapeutic role for a
protein represented by the cluster has been predicted. The cluster was assigned this field because
there was information in the drug database or the public databases (e.g., described herein above)
that this protein, or part thereof, is used or can be used for a potential therapeutic indication:
Anorectic/Antiobesity; Releasing hormone; Anticancer; Respiratory; Antidiabetic.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: signal transduction; neuropeptide signaling pathway, which are
annotation(s) related to Biological Process; growth factor, which are annotation(s) related to
Molecular Function; and secreted, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster HUMGRP5E features 2 transcript(s), which were listed in Table
160 above. These transcript(s) encode for protein(s) which are variant(s) of protein
Gastrin-releasing peptide precursor. A description of each variant protein according to the present
invention is now provided.

Variant protein HUMGRP5E_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HUMGRP5E_T4.
An alignment is given to the known protein (Gastrin-releasing peptide precursor) at the end of
the application. One or more alignments to one or more previously published protein sequences
are given at the end of the application. A brief description of the relationship of the variant
protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMGRP5E_P4 and GRP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMGRP5E_P4, comprising a first
amino acid sequence being at least 90 % homologous to
MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTG
ESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSED
SSNFKDVGSKGK corresponding to amino acids 1 - 127 of GRP_HUMAN, which also
corresponds to amino acids 1 - 127 of HUMGRP5E_P4, and a second amino acid sequence
being at least 90 % homologous to GSQREGRNPQLNQQ corresponding to amino acids 135 -
148 of GRP_HUMAN, which also corresponds to amino acids 128 - 141 of HUMGRP5E_P4,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of HUMGRP5E_P4,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise KG, having a
structure as follows: a sequence starting from any of amino acid numbers 127-x to 127; and
ending at any of amino acid numbers 128 + ((n-2) - x), in which x varies from 0 to n-2.
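
The edge-portion definition above can be read as a sliding window of length n that always contains the junction residues at positions 127 and 128 (K and G). The sketch below enumerates these windows. It is an illustration under stated assumptions: the GRP_HUMAN sequence is copied from the alignment given for this cluster, the variant is assembled from residues 1-127 and 135-148 as stated in the comparison report, and the function name is hypothetical.

```python
# Known protein sequence as printed in the alignment for this cluster (148 residues).
GRP_HUMAN = (
    "MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM"
    "GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ"
    "PKALGNQQPSWDSEDSSNFKDVGSKGKVGRLSAPGSQREGRNPQLNQQ"
)

# 1-based inclusive ranges 1-127 and 135-148, converted to Python slices.
HUMGRP5E_P4 = GRP_HUMAN[0:127] + GRP_HUMAN[134:148]


def edge_windows(seq, junction, n):
    """Yield (start, end, peptide) for each length-n window that contains
    residues `junction` and `junction + 1` (1-based, inclusive)."""
    for x in range(0, n - 1):                 # x varies from 0 to n-2
        start = junction - x
        end = junction + 1 + (n - 2) - x
        if start >= 1 and end <= len(seq):
            yield start, end, seq[start - 1:end]


windows = list(edge_windows(HUMGRP5E_P4, junction=127, n=10))
for start, end, peptide in windows:
    print(start, end, peptide)
```

Every window has length n because end - start + 1 = n for any x, and every window spans the junction because start is at most 127 and end is at least 128.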

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMGRP5E_P4 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 164 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMGRP5E_P4
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 164 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
4                                         S -> R                       Yes

Variant protein HUMGRP5E_P4 is encoded by the following transcript(s):
HUMGRP5E_T4, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMGRP5E_T4 is shown in bold; this coding portion starts at
position 622 and ends at position 1044. The transcript also has the following SNPs as listed in
Table 165 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMGRP5E_P4 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 165 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
541                                    -> T                        No
542                                    G -> T                      No
631                                    A -> C                      Yes
672                                    G -> A                      Yes
1340                                   C ->                        No
1340                                   C -> A                      No
1341                                   A ->                        No
1341                                   A -> G                      No
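
The nucleotide SNP positions in Table 165 can be related to the coding portion of transcript HUMGRP5E_T4 (positions 622 to 1044) with simple arithmetic. The sketch below, using a hypothetical helper, reports for each position whether it lies inside the coding portion and, if so, which codon and which base of that codon it affects. It assumes 1-based inclusive coordinates and that the coding portion is read in frame from its first position.

```python
def locate_snp(position, cds_start, cds_end):
    """Return (codon_number, base_within_codon), both 1-based, or None when
    the position lies outside the coding portion."""
    if not cds_start <= position <= cds_end:
        return None
    offset = position - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START, CDS_END = 622, 1044          # coding portion of HUMGRP5E_T4
SNP_POSITIONS = (541, 542, 631, 672, 1340, 1341)

# Length of the coding portion in codons.
codon_count = (CDS_END - CDS_START + 1) // 3
print("codons:", codon_count)

for pos in SNP_POSITIONS:
    print(pos, locate_snp(pos, CDS_START, CDS_END))
```

Under these assumptions position 631 falls on the first base of codon 4, which is consistent with the amino acid SNP at position 4 listed in Table 164, while positions 541, 542, 1340 and 1341 lie outside the coding portion.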



Variant protein HUMGRP5E_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HUMGRP5E_T5.
An alignment is given to the known protein (Gastrin-releasing peptide precursor) at the end of
the application. One or more alignments to one or more previously published protein sequences
are given at the end of the application. A brief description of the relationship of the variant
protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMGRP5E_P5 and GRP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMGRP5E_P5, comprising a first
amino acid sequence being at least 90 % homologous to
MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTG
ESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSED
SSNFKDVGSKGK corresponding to amino acids 1 - 127 of GRP_HUMAN, which also
corresponds to amino acids 1 - 127 of HUMGRP5E_P5, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
DSLLQVLNVKEGTPS corresponding to amino acids 128 - 142 of HUMGRP5E_P5, wherein
said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMGRP5E_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DSLLQVLNVKEGTPS in HUMGRP5E_P5.



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMGRP5E_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 166 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMGRP5E_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 166 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
4                                         S -> R                       Yes



Variant protein HUMGRP5E_P5 is encoded by the following transcript(s):
HUMGRP5E_T5, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMGRP5E_T5 is shown in bold; this coding portion starts at
position 622 and ends at position 1047. The transcript also has the following SNPs as listed in
Table 167 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMGRP5E_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 167 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
541                                    -> T                        No
542                                    G -> T                      No
631                                    A -> C                      Yes
672                                    G -> A                      Yes
1354                                   C ->                        No
1354                                   C -> A                      No
1355                                   A ->                        No
1355                                   A -> G                      No



As noted above, cluster HUMGRP5E features 5 segment(s), which were listed in Table
161 above and for which the sequence(s) are given at the end of the application. These
segment(s) are portions of nucleic acid sequence(s) which are described herein separately
because they are of particular interest. A description of each segment according to the present
invention is now provided.

Segment cluster HUMGRP5E_node_0 according to the present invention is supported by
21 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMGRP5E_T4 and HUMGRP5E_T5. Table 168
below describes the starting and ending position of this segment on each transcript.



Table 168 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMGRP5E_T4        1                            760
HUMGRP5E_T5        1                            760



Segment cluster HUMGRP5E_node_2 according to the present invention is supported by
27 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMGRP5E_T4 and HUMGRP5E_T5. Table 169
below describes the starting and ending position of this segment on each transcript.

Table 169 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMGRP5E_T4        761                          984
HUMGRP5E_T5        761                          984



Segment cluster HUMGRP5E_node_8 according to the present invention is supported by
26 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMGRP5E_T4 and HUMGRP5E_T5. Table 170
below describes the starting and ending position of this segment on each transcript.

Table 170 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMGRP5E_T4        1004                         1362
HUMGRP5E_T5        1018                         1376

According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster HUMGRP5E_node_3 according to the present invention can be found in
the following transcript(s): HUMGRP5E_T4 and HUMGRP5E_T5. Table 171 below describes
the starting and ending position of this segment on each transcript.

Table 171 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMGRP5E_T4        985                          1003
HUMGRP5E_T5        985                          1003



Segment cluster HUMGRP5E_node_7 according to the present invention can be found in
the following transcript(s): HUMGRP5E_T5. Table 172 below describes the starting and ending
position of this segment on each transcript.

Table 172 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMGRP5E_T5        1004                         1017
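
Taken together, Tables 168 to 172 describe how the five segments tile the two transcripts. The following sketch, with hypothetical helper names, encodes the positions from those tables and checks that the segments of each transcript are contiguous, starting at position 1. Under that reading, the only structural difference between the transcripts is the 14-nucleotide segment node_7, which is present only in HUMGRP5E_T5.

```python
# (segment name, start, end), 1-based inclusive, copied from Tables 168-172.
SEGMENTS = {
    "HUMGRP5E_T4": [
        ("node_0", 1, 760), ("node_2", 761, 984),
        ("node_3", 985, 1003), ("node_8", 1004, 1362),
    ],
    "HUMGRP5E_T5": [
        ("node_0", 1, 760), ("node_2", 761, 984), ("node_3", 985, 1003),
        ("node_7", 1004, 1017), ("node_8", 1018, 1376),
    ],
}


def segment_length(segment):
    """Length of a (name, start, end) segment with inclusive coordinates."""
    _, start, end = segment
    return end - start + 1


def is_contiguous(segments):
    """True if the segments start at position 1 and follow each other without gaps or overlaps."""
    ordered = sorted(segments, key=lambda s: s[1])
    return ordered[0][1] == 1 and all(
        prev[2] + 1 == nxt[1] for prev, nxt in zip(ordered, ordered[1:])
    )


for transcript, segments in SEGMENTS.items():
    total = sum(segment_length(s) for s in segments)
    print(transcript, "contiguous:", is_contiguous(segments), "covered length:", total)
```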



Microarray (chip) data is also available for this gene as follows. As described above with
regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (with regard to lung cancer), shown in Table 173.

Table 173 - Oligonucleotides related to this gene

Oligonucleotide name    Overexpressed in cancers    Chip reference
HUMGRP5E_0_0_16630      Lung cancer                 Lung
HUMGRP5E_0_2_0          Lung cancer                 Lung

Variant protein alignment to the previously known protein:

Sequence name: /tmp/412zs2mwyT/B0wjOUAXOd:GRP_HUMAN

Sequence documentation:

Alignment of: HUMGRP5E_P4 x GRP_HUMAN
Alignment segment 1/1:

Quality: 1291.00
Escore: 0
Matching length: 141
Total length: 148
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 95.27
Total Percent Identity: 95.27
Gaps: 1

Alignment:

  1 MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM 50

 51 GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ 100

101 PKALGNQQPSWDSEDSSNFKDVGSKGK.......GSQREGRNPQLNQQ 141
    |||||||||||||||||||||||||||       ||||||||||||||
101 PKALGNQQPSWDSEDSSNFKDVGSKGKVGRLSAPGSQREGRNPQLNQQ 148



Sequence name: /tmp/lme91dnvfv/KbP5io8PtU:GRP_HUMAN

Sequence documentation:

Alignment of: HUMGRP5E_P5 x GRP_HUMAN
Alignment segment 1/1:

Quality: 1248.00
Escore: 0
Matching length: 127
Total length: 127
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM 50

 51 GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ 100

101 PKALGNQQPSWDSEDSSNFKDVGSKGK 127
    |||||||||||||||||||||||||||
101 PKALGNQQPSWDSEDSSNFKDVGSKGK 127
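
The summary figures reported for the HUMGRP5E_P4 alignment can be recomputed from the aligned rows. The sketch below is illustrative only and makes two assumptions about the report: that "matching length" counts the aligned columns without a gap, and that "total length" counts all alignment columns. The function name is hypothetical, and the gap is written with "-" characters.

```python
GRP_HUMAN = (
    "MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLM"
    "GKKSTGESSSVSERGSLKQQLREYIRWEEAARNLLGLIEAKENRNHQPPQ"
    "PKALGNQQPSWDSEDSSNFKDVGSKGKVGRLSAPGSQREGRNPQLNQQ"
)

# Variant row: residues 1-127, a 7-residue gap, then residues 135-148 of the known protein.
P4_ROW = GRP_HUMAN[:127] + "-" * 7 + GRP_HUMAN[134:]


def alignment_summary(row_a, row_b, gap="-"):
    """Return (matching_length, total_length, matching_identity_pct, total_identity_pct)
    for two equal-length aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    paired = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    identical = sum(a == b for a, b in paired)
    total = len(row_a)
    return len(paired), total, 100 * identical / len(paired), 100 * identical / total


matching, total, matching_pct, total_pct = alignment_summary(P4_ROW, GRP_HUMAN)
print(matching, total, f"{matching_pct:.2f}", f"{total_pct:.2f}")
```

Under these assumptions the total percent identity is 141/148, which rounds to the 95.27 shown in the report.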



Expression of GRP_HUMAN - gastrin-releasing peptide (HUMGRP5E) transcripts which
are detectable by amplicon as depicted in sequence name HUMGRP5Ejunc3-7 in normal
and cancerous lung tissues

Expression of GRP_HUMAN - gastrin-releasing peptide transcripts detectable by or
according to HUMGRP5Ejunc3-7 amplicon (SEQ ID NO: 1648) and HUMGRP5Ejunc3-7F
(SEQ ID NO: 1646) and HUMGRP5Ejunc3-7R (SEQ ID NO: 1647) primers was measured by
real time PCR. In parallel the expression of four housekeeping genes - PBGD (GenBank
Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO: 334), HPRT1 (GenBank
Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO: 1297), Ubiquitin
(GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO: 328) and
SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO: 331) -
was measured similarly. For each RT sample, the expression of the above amplicon was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel"), to obtain a value of fold up-regulation for each sample relative to the median of the
normal PM samples.

Figure 19 is a histogram showing over-expression of the above-indicated
GRP_HUMAN - gastrin-releasing peptide transcripts in several cancerous lung samples relative
to the normal samples. As is evident from Figure 19, the expression of GRP_HUMAN -
gastrin-releasing peptide transcripts detectable by the above amplicon in several cancer samples
was significantly higher than in the non-cancerous samples (Sample Nos. 47-50, 90-93, 96-99,
Table 2, "Tissue samples in testing panel"). Notably, an over-expression of at least 10 fold was
found in 2 out of 15 adenocarcinoma samples, and in 7 out of 8 small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: HUMGRP5Ejunc3-7F forward
primer; and HUMGRP5Ejunc3-7R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon:
HUMGRP5Ejunc3-7.

HUMGRP5Ejunc3-7F (SEQ ID NO: 1646)
ACCAGCCACCTCAACCCA

HUMGRP5Ejunc3-7R (SEQ ID NO: 1647)
CTGGAGCAGAGAGTCTTTGCCT

HUMGRP5Ejunc3-7 (SEQ ID NO: 1648)
ACCAGCCACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAG
GATAGCAGCAACTTCAAAGATGTAGGTTCAAAAGGCAAAGACTCTCTGCTCCAG
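
Translating the HUMGRP5Ejunc3-7 amplicon illustrates why it is described as a junction amplicon. The sketch below uses the standard genetic code; the reading-frame offset of 2 is an assumption chosen so that the translation agrees with the protein sequences given for this cluster, and the helper names are hypothetical. In that frame the peptide ends with the residues GSKGK followed by DSLLQ, i.e. it spans the boundary between the shared region (amino acids 1-127) and the unique tail of HUMGRP5E_P5.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate complete codons of an uppercase DNA string, starting at `offset`."""
    usable_end = len(dna) - (len(dna) - offset) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, usable_end, 3))


AMPLICON = (  # HUMGRP5Ejunc3-7, SEQ ID NO: 1648
    "ACCAGCCACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAG"
    "GATAGCAGCAACTTCAAAGATGTAGGTTCAAAAGGCAAAGACTCTCTGCTCCAG"
)

peptide = translate(AMPLICON, offset=2)   # assumed reading frame
print(peptide)
```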



Expression of GRP_HUMAN - gastrin-releasing peptide (HUMGRP5E) transcripts which are
detectable by amplicon as depicted in sequence name HUMGRP5Ejunc3-7 in different normal
tissues

Expression of GRP_HUMAN - gastrin-releasing peptide transcripts detectable by or
according to HUMGRP5Ejunc3-7 amplicon (SEQ ID NO: 1648) and primers HUMGRP5Ejunc3-7F
(SEQ ID NO: 1646) and HUMGRP5Ejunc3-7R (SEQ ID NO: 1647) was measured by real
time PCR. In parallel the expression of four housekeeping genes - RPL19 (GenBank Accession
No. NM_000981; RPL19 amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No.
NM_003194; TATA amplicon, SEQ ID NO: 1633), Ubiquitin (GenBank Accession No.
BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO: 328) and SDHA (GenBank Accession
No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO: 331) - was measured similarly. For
each RT sample, the expression of the above amplicon was normalized to the geometric mean of
the quantities of the housekeeping genes. The normalized quantity of each RT sample was then
divided by the median of the quantities of the breast samples (Sample Nos. 33-35, Table 3,
"Tissue samples in normal panel", above), to obtain a value of relative expression of each
sample relative to the median of the breast samples.

HUMGRP5Ejunc3-7F (SEQ ID NO: 1646)
ACCAGCCACCTCAACCCA

HUMGRP5Ejunc3-7R (SEQ ID NO: 1647)
CTGGAGCAGAGAGTCTTTGCCT

HUMGRP5Ejunc3-7 (SEQ ID NO: 1648)
ACCAGCCACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAG
GATAGCAGCAACTTCAAAGATGTAGGTTCAAAAGGCAAAGACTCTCTGCTCCAG

The results are shown in Figure 20, demonstrating the expression of GRP_HUMAN -
gastrin-releasing peptide (HUMGRP5E) transcripts which are detectable by amplicon as depicted in
sequence name HUMGRP5Ejunc3-7 in different normal tissues.



DESCRIPTION FOR CLUSTER D56406

Cluster D56406 features 3 transcript(s) and 10 segment(s) of interest, the names for which
are given in Tables 174 and 175, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 176.

Table 174 - Transcripts of interest

Transcript Name        Sequence ID No.
D56406_PEA_1_T3        22
D56406_PEA_1_T6        23
D56406_PEA_1_T7        24

Table 175 - Segments of interest

Segment Name           Sequence ID No.
D56406_PEA_1_node_0    340
D56406_PEA_1_node_13   341
D56406_PEA_1_node_11   342
D56406_PEA_1_node_2    343
D56406_PEA_1_node_3    344
D56406_PEA_1_node_5    345
D56406_PEA_1_node_6    346
D56406_PEA_1_node_7    347
D56406_PEA_1_node_8    348
D56406_PEA_1_node_9    349

Table 176 - Proteins of interest

Protein Name           Sequence ID No.
D56406_PEA_1_P2        1301
D56406_PEA_1_P5        1302
D56406_PEA_1_P6        1303



These sequences are variants of the known protein Neurotensin/neuromedin N precursor
[Contains: Large neuromedin N (NmN-125); Neuromedin N (NmN) (NN); Neurotensin (NT);
Tail peptide] (SwissProt accession identifier NEUT_HUMAN), SEQ ID NO: 1422, referred to
herein as the previously known protein.

Protein Neurotensin/neuromedin N precursor is known or believed to have the following
function(s): Neurotensin may play an endocrine or paracrine role in the regulation of fat
metabolism. It causes contraction of smooth muscle. The sequence for protein
Neurotensin/neuromedin N precursor is given at the end of the application, as
"Neurotensin/neuromedin N precursor [Contains: Large neuromedin N (NmN-125);
Neuromedin N (NmN) (NN); Neurotensin (NT); Tail peptide] amino acid sequence". Protein
Neurotensin/neuromedin N precursor localization is believed to be Secreted; Packaged within
secretory vesicles.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: signal transduction, which are annotation(s) related to Biological
Process; neuropeptide hormone, which are annotation(s) related to Molecular Function; and
extracellular; soluble fraction, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster D56406 features 3 transcript(s), which were listed in Table 174
above. These transcript(s) encode for protein(s) which are variant(s) of protein
Neurotensin/neuromedin N precursor. A description of each variant protein according to the
present invention is now provided.

Variant protein D56406_PEA_1 JP2 according to the present invention has an amino acid 

10 sequence as given at the end of the application; it is encoded by transcript(s) 

D56406_PEA1_T3. An alignment is given to the known protein (Neurotensin/neuromedin N 
precursor) at the end of the application. One or more alignments to one or more previously 
published protein sequences are given at the end of the application. A brief description of the 
relationship of the variant protein according to the present invention to each such aligned protein 

15 is as follows: 

Comparison report between D56406_PEA_1_P2 and NEUT_HUMAN:

1. An isolated chimeric polypeptide encoding for D56406_PEA_1_P2, comprising a first
amino acid sequence being at least 90% homologous to
MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWE
corresponding to amino acids 1-120 of NEUT_HUMAN, which also corresponds to
amino acids 1-120 of D56406_PEA_1_P2, a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
ARWLTPVIPALWEAETGGSRGQEMETIPANT corresponding to amino acids 121-151 of
D56406_PEA_1_P2, and a third amino acid sequence being at least 90% homologous to
LIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYYY corresponding to
amino acids 121-170 of NEUT_HUMAN, which also corresponds to amino acids 152-201 of
D56406_PEA_1_P2, wherein said first, second and third amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for an edge portion of D56406_PEA_1_P2,
comprising an amino acid sequence being at least 70%, optionally at least about 80%, preferably
at least about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence encoding for ARWLTPVIPALWEAETGGSRGQEMETIPANT,
corresponding to D56406_PEA_1_P2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein D56406_PEA_1_P2 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 177 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein D56406_PEA_1_P2
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 177 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
30                                        M -> V                       No
44                                        S -> P                       No
84                                        V ->                         No
84                                        V -> A                       No

Variant protein D56406_PEA_1_P2 is encoded by the following transcript(s):
D56406_PEA_1_T3, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript D56406_PEA_1_T3 is shown in bold; this coding portion starts at
position 106 and ends at position 708. The transcript also has the following SNPs as listed in
Table 178 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein D56406_PEA_1_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 178 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
94                                     G -> T                      No
95                                     A -> T                      No
858                                    T -> G                      Yes
103                                    A -> G                      Yes
193                                    A -> G                      No
235                                    T -> C                      No
339                                    T -> C                      No
356                                    T ->                        No
356                                    T -> C                      No
417                                    A -> T                      No
757                                    T ->                        No
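
The amino acid positions in Table 177 can be cross-checked against the nucleotide positions in Table 178 by counting codons from the first coding nucleotide. The following is a minimal sketch, assuming 1-based positions and an uninterrupted reading frame that begins at position 106 (the stated start of the coding portion of D56406_PEA_1_T3). The function name `codon_index` is illustrative and is not part of the application.

```python
def codon_index(nt_pos, cds_start, cds_end):
    """Map a 1-based nucleotide position to a 1-based amino acid position.

    Returns None when the position lies outside the coding portion.
    Assumes a single uninterrupted reading frame starting at cds_start.
    """
    if not (cds_start <= nt_pos <= cds_end):
        return None
    return (nt_pos - cds_start) // 3 + 1


# Coding portion of D56406_PEA_1_T3 as stated in the text.
CDS_START, CDS_END = 106, 708

for nt in (94, 193, 235, 356, 757):
    print(nt, codon_index(nt, CDS_START, CDS_END))
```

Under these assumptions, nucleotide positions 193, 235 and 356 fall in codons 30, 44 and 84, which are the amino acid positions listed in Table 177, while positions 94 and 757 lie outside the coding portion. The coding portion spans 708 - 106 + 1 = 603 nucleotides, that is 201 codons, which agrees with the 201-residue length of the variant protein.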



Variant protein D56406_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
D56406_PEA_1_T6. An alignment is given to the known protein (Neurotensin/neuromedin N
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between D56406_PEA_1_P5 and NEUT_HUMAN:

1. An isolated chimeric polypeptide encoding for D56406_PEA_1_P5, comprising a first
amino acid sequence being at least 90% homologous to MMAGMKIQLVCMLLLAFSSWSLC
corresponding to amino acids 1-23 of NEUT_HUMAN, which also corresponds to amino acids
1-23 of D56406_PEA_1_P5, and a second amino acid sequence being at least 90%
homologous to
SEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYYY
corresponding to amino acids 26-170 of NEUT_HUMAN, which also corresponds to
amino acids 24-168 of D56406_PEA_1_P5, wherein said first and second amino acid sequences
are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of D56406_PEA_1_P5,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise CS, having a
structure as follows: a sequence starting from any of amino acid numbers 23-x to 23; and ending
at any of amino acid numbers 24 + ((n-2) - x), in which x varies from 0 to n-2.
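
The edge-portion definition can be read as a sliding window of length n that always contains the junction residues at positions 23 and 24 (C and S). The sketch below enumerates those windows for n = 10. It assumes 1-based, inclusive positions, builds the D56406_PEA_1_P5 sequence by joining the two stretches named in the comparison report, and skips any window that would run past either end of the sequence; the helper name `edge_windows` is hypothetical.

```python
def edge_windows(seq, junction, n):
    """Yield (start, end, peptide) for each length-n window spanning the
    residues at `junction` and `junction + 1` (1-based, inclusive)."""
    for x in range(0, n - 1):                  # x = 0 .. n-2
        start = junction - x                   # "23 - x"
        end = junction + 1 + (n - 2) - x       # "24 + ((n-2) - x)"
        if start < 1 or end > len(seq):
            continue                           # assumption: out-of-range windows are skipped
        yield start, end, seq[start - 1:end]


# D56406_PEA_1_P5 assembled from the first (1-23) and second (24-168) stretches.
P5 = (
    "MMAGMKIQLVCMLLLAFSSWSLC"
    "SEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVHEEEL"
    "VARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKR"
    "KIPYILKRQLYENKPRRPYILKRDSYYY"
)

windows = list(edge_windows(P5, junction=23, n=10))
for start, end, peptide in windows:
    print(start, end, peptide)
```

Each window has end - start + 1 = n residues, so for n = 10 there are nine windows, running from positions 23-32 (x = 0) to positions 15-24 (x = 8), and every one of them contains the CS junction.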

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein D56406_PEA_1_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 179 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein D56406_PEA_1_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 179 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
28                                        M -> V                       No
42                                        S -> P                       No
82                                        V ->                         No
82                                        V -> A                       No

Variant protein D56406_PEA_1_P5 is encoded by the following transcript(s):
D56406_PEA_1_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript D56406_PEA_1_T6 is shown in bold; this coding portion starts at
position 106 and ends at position 609. The transcript also has the following SNPs as listed in
Table 180 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein D56406_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 180 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
94                                     G -> T                      No
95                                     A -> T                      No
759                                    T -> G                      Yes
806                                    G -> A                      Yes
1014                                   T -> G                      No
1178                                   T -> G                      No
103                                    A -> G                      Yes
187                                    A -> G                      No
229                                    T -> C                      No
333                                    T -> C                      No
350                                    T ->                        No
350                                    T -> C                      No
411                                    A -> T                      No
658                                    T ->                        No

Variant protein D56406_PEA_1_P6 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
D56406_PEA_1_T7. An alignment is given to the known protein (Neurotensin/neuromedin N
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between D56406_PEA_1_P6 and NEUT_HUMAN:

1. An isolated chimeric polypeptide encoding for D56406_PEA_1_P6, comprising a first
amino acid sequence being at least 90% homologous to
MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSK corresponding to
amino acids 1-45 of NEUT_HUMAN, which also corresponds to amino acids 1-45 of
D56406_PEA_1_P6, and a second amino acid sequence being at least 90% homologous to
LIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYYY corresponding to
amino acids 121-170 of NEUT_HUMAN, which also corresponds to amino acids 46-95 of
D56406_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of D56406_PEA_1_P6,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise KL, having a
structure as follows: a sequence starting from any of amino acid numbers 45-x to 45; and ending
at any of amino acid numbers 46 + ((n-2) - x), in which x varies from 0 to n-2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein D56406_PEA_1_P6 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 181 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein D56406_PEA_1_P6
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 181 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
30                                        M -> V                       No
44                                        S -> P                       No

Variant protein D56406_PEA_1_P6 is encoded by the following transcript(s):
D56406_PEA_1_T7, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript D56406_PEA_1_T7 is shown in bold; this coding portion starts at
position 106 and ends at position 390. The transcript also has the following SNPs as listed in
Table 182 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein D56406_PEA_1_P6 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 182 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
94                                     G -> T                      No
95                                     A -> T                      No
103                                    A -> G                      Yes
193                                    A -> G                      No
235                                    T -> C                      No
439                                    T ->                        No
540                                    T -> G                      Yes
587                                    G -> A                      Yes
795                                    T -> G                      No
959                                    T -> G                      No

As noted above, cluster D56406 features 10 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster D56406_PEA_1_node_0 according to the present invention is supported
by 48 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3, D56406_PEA_1_T6 and
D56406_PEA_1_T7. Table 183 below describes the starting and ending position of this segment
on each transcript.

Table 183 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     1                            178
D56406_PEA_1_T6     1                            178
D56406_PEA_1_T7     1                            178


Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (with regard to lung cancer), shown in Table 184.

Table 184 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
D56406_0_5_0            lung malignant tumors       LUN

Segment cluster D56406_PEA_1_node_13 according to the present invention is supported
by 43 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3, D56406_PEA_1_T6 and
D56406_PEA_1_T7. Table 185 below describes the starting and ending position of this segment
on each transcript.

Table 185 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     559                          902
D56406_PEA_1_T6     460                          1239
D56406_PEA_1_T7     241                          1020

According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster D56406_PEA_1_node_11 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3. Table 186 below describes the
starting and ending position of this segment on each transcript.

Table 186 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     466                          558

Segment cluster D56406_PEA_1_node_2 according to the present invention can be found
in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T7. Table 187 below
describes the starting and ending position of this segment on each transcript.

Table 187 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     179                          184
D56406_PEA_1_T7     179                          184

Segment cluster D56406_PEA_1_node_3 according to the present invention is supported
by 46 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3, D56406_PEA_1_T6 and
D56406_PEA_1_T7. Table 188 below describes the starting and ending position of this segment
on each transcript.

Table 188 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     185                          240
D56406_PEA_1_T6     179                          234
D56406_PEA_1_T7     185                          240

Segment cluster D56406_PEA_1_node_5 according to the present invention is supported
by 48 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T6. Table
189 below describes the starting and ending position of this segment on each transcript.

Table 189 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     241                          355
D56406_PEA_1_T6     235                          349

Segment cluster D56406_PEA_1_node_6 according to the present invention is supported
by 34 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T6. Table
190 below describes the starting and ending position of this segment on each transcript.

Table 190 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     356                          389
D56406_PEA_1_T6     350                          383

Segment cluster D56406_PEA_1_node_7 according to the present invention is supported
by 32 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T6. Table
191 below describes the starting and ending position of this segment on each transcript.

Table 191 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     390                          415
D56406_PEA_1_T6     384                          409

Segment cluster D56406_PEA_1_node_8 according to the present invention can be found
in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T6. Table 192 below
describes the starting and ending position of this segment on each transcript.

Table 192 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     416                          423
D56406_PEA_1_T6     410                          417

Segment cluster D56406_PEA_1_node_9 according to the present invention is supported
by 31 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): D56406_PEA_1_T3 and D56406_PEA_1_T6. Table
193 below describes the starting and ending position of this segment on each transcript.

Table 193 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
D56406_PEA_1_T3     424                          465
D56406_PEA_1_T6     418                          459
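
Taken together, the start and end positions in Tables 183 and 185 to 193 tile each transcript without gaps or overlaps. The sketch below checks this, using only the coordinates given in those tables; the dictionary layout and the helper name `is_contiguous` are illustrative assumptions rather than anything defined in the application.

```python
# (start, end) of every segment on each transcript, 1-based and inclusive,
# copied from Tables 183 and 185-193.
SEGMENTS = {
    "D56406_PEA_1_T3": [(1, 178), (179, 184), (185, 240), (241, 355), (356, 389),
                        (390, 415), (416, 423), (424, 465), (466, 558), (559, 902)],
    "D56406_PEA_1_T6": [(1, 178), (179, 234), (235, 349), (350, 383), (384, 409),
                        (410, 417), (418, 459), (460, 1239)],
    "D56406_PEA_1_T7": [(1, 178), (179, 184), (185, 240), (241, 1020)],
}


def is_contiguous(intervals):
    """True if the intervals start at position 1 and each one begins
    immediately after the previous one ends."""
    ordered = sorted(intervals)
    if ordered[0][0] != 1:
        return False
    return all(prev_end + 1 == start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))


for name, intervals in SEGMENTS.items():
    print(name, is_contiguous(intervals), max(end for _, end in intervals))
```

With the tabulated coordinates, all three transcripts pass the check, and the last segment ends at position 902, 1239 and 1020 for D56406_PEA_1_T3, D56406_PEA_1_T6 and D56406_PEA_1_T7 respectively.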



Variant protein alignment to the previously known protein:

Sequence name: /tmp/jU49325aMA/8F0XuN7La5:NEUT_HUMAN
Sequence documentation:
Alignment of: D56406_PEA_1_P2 x NEUT_HUMAN

Alignment segment 1/1:
Quality: 1591.00
Escore: 0
Matching length: 170                    Total length: 201
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 84.58         Total Percent Identity: 84.58
Gaps: 1

Alignment:

  1 MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAH 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAH 50

 51 VPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEA 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEA 100

101 MLTIYQLHKICHSRAFQHWEARWLTPVIPALWEAETGGSRGQEMETIPAN 150
    ||||||||||||||||||||
101 MLTIYQLHKICHSRAFQHWE.............................. 120

151 TLIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYY 200
     |||||||||||||||||||||||||||||||||||||||||||||||||
121 .LIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYY 169

201 Y 201
    |
170 Y 170
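
The summary figures reported with each alignment are mutually consistent if the total percent identity is taken to be the matching length divided by the total length. The sketch below reproduces the reported values on that assumption, which holds here because the matching percent identity is 100.00 in every case; the function name `total_percent` is illustrative.

```python
def total_percent(matching_length, total_length):
    """Percentage of the total alignment length covered by matching residues,
    rounded to two decimals as in the alignment reports."""
    return round(100.0 * matching_length / total_length, 2)


# (matching length, total length) as reported for each variant vs NEUT_HUMAN.
REPORTS = {
    "D56406_PEA_1_P2": (170, 201),
    "D56406_PEA_1_P5": (168, 170),
    "D56406_PEA_1_P6": (95, 170),
}

for name, (matching, total) in REPORTS.items():
    print(name, total_percent(matching, total))
```

This gives 84.58, 98.82 and 55.88 for the three variants, matching the "Total Percent Identity" values stated in the respective alignment reports.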

Sequence name: /tmp/wWui8Kd4y9/zbfSihRwnR:NEUT_HUMAN
Sequence documentation:
Alignment of: D56406_PEA_1_P5 x NEUT_HUMAN

Alignment segment 1/1:
Quality: 1572.00
Escore: 0
Matching length: 168                    Total length: 170
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 98.82         Total Percent Identity: 98.82
Gaps: 1

Alignment:

  1 MMAGMKIQLVCMLLLAFSSWSLC..SEEEMKALEADFLTNMHTSKISKAH 48
    |||||||||||||||||||||||  |||||||||||||||||||||||||
  1 MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAH 50

 49 VPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEA 98
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEA 100

 99 MLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKR 148
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 MLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKR 150

149 QLYENKPRRPYILKRDSYYY 168
    ||||||||||||||||||||
151 QLYENKPRRPYILKRDSYYY 170



Sequence name: /tmp/f5d07fF5D7/E4N5xjUIAN:NEUT_HUMAN
Sequence documentation:
Alignment of: D56406_PEA_1_P6 x NEUT_HUMAN

Alignment segment 1/1:
Quality: 844.00
Escore: 0
Matching length: 95                     Total length: 170
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 55.88         Total Percent Identity: 55.88
Gaps: 1

Alignment:

  1 MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSK..... 45
    |||||||||||||||||||||||||||||||||||||||||||||
  1 MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAH 50

 45 .................................................. 45

 51 VPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEA 100

 46 ....................LIQEDILDTGNDKNGKEEVIKRKIPYILKR 75
                        ||||||||||||||||||||||||||||||
101 MLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKR 150

 76 QLYENKPRRPYILKRDSYYY 95
    ||||||||||||||||||||
151 QLYENKPRRPYILKRDSYYY 170

DESCRIPTION FOR CLUSTER F05068

Cluster F05068 features 3 transcript(s) and 12 segment(s) of interest, the names for which
are given in Tables 194 and 195, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 196.

Table 194 - Transcripts of interest

Transcript Name     Sequence ID No.
F05068_PEA_1_T3     25
F05068_PEA_1_T4     26
F05068_PEA_1_T6     27

Table 195 - Segments of interest

Segment Name            Sequence ID No.
F05068_PEA_1_node_0     350
F05068_PEA_1_node_10    351
F05068_PEA_1_node_12    352
F05068_PEA_1_node_13    353
F05068_PEA_1_node_4     354
F05068_PEA_1_node_8     355
F05068_PEA_1_node_11    356
F05068_PEA_1_node_3     357
F05068_PEA_1_node_5     358
F05068_PEA_1_node_6     359
F05068_PEA_1_node_7     360
F05068_PEA_1_node_9     361


Table 196 - Proteins of interest

Protein Name        Sequence ID No.
F05068_PEA_1_P7     1304
F05068_PEA_1_P8     1305



These sequences are variants of the known protein ADM precursor [Contains:
Adrenomedullin (AM); Proadrenomedullin N-20 terminal peptide (ProAM-N20) (ProAM
N-terminal 20 peptide) (PAMP)] (SwissProt accession identifier ADML_HUMAN), SEQ ID
NO:1423, referred to herein as the previously known protein.

Protein ADM precursor is known or believed to have the following function(s): AM and
PAMP are potent hypotensive and vasodilator agents. Numerous actions have been reported,
most related to the physiologic control of fluid and electrolyte homeostasis. In the kidney, AM
is diuretic and natriuretic, and both AM and PAMP inhibit aldosterone secretion by direct
adrenal actions. In the pituitary gland, both peptides at physiologically relevant doses inhibit basal
ACTH secretion. Both peptides appear to act in the brain and pituitary gland to facilitate the loss of
plasma volume, actions which complement their hypotensive effects in blood vessels. The
sequence for protein ADM precursor is given at the end of the application, as "ADM precursor
[Contains: Adrenomedullin (AM); Proadrenomedullin N-20 terminal peptide (ProAM-N20)
(ProAM N-terminal 20 peptide) (PAMP)] amino acid sequence". Known polymorphisms for
this sequence are as shown in Table 197.

Table 197 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
50                                        S -> R (in dbSNP:5005). /FTId=VAR_014861.

Protein ADM precursor localization is believed to be Secreted. 

The following GO Annotation(s) apply to the previously known protein. The following 
annotation(s) were found: cAMP biosynthesis; progesterone biosynthesis; signal transduction; 
cell-cell signaling; pregnancy; excretion; circulation; response to wounding, which are 
annotation(s) related to Biological Process; ligand; hormone, which are annotation(s) related to 
Molecular Function; and extracellular space; soluble fraction, which are annotation(s) related to 
Cellular Component. 

The GO assignment relies on information from one or more of the SwissProt/TremBl 
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available 
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>. 

Cluster F05068 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 21 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
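
The "parts per million" figure can be illustrated with a small calculation. The sketch below assumes a plain, unweighted ratio of EST counts, since the weighting scheme is not spelled out in this passage; the function name and the example counts are hypothetical and are not taken from the application.

```python
def parts_per_million(cluster_ests, category_ests):
    """Expression of a cluster in one tissue category, as parts per million
    of all ESTs observed in that category (unweighted; an assumption)."""
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical counts: 5 ESTs from the cluster among 100,000 ESTs in a category.
print(parts_per_million(5, 100_000))
```

With these made-up counts the result is 50.0 parts per million; a tissue with no ESTs from the cluster would score 0, as several tissues do in Table 198.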

Overall, the following results were obtained as shown with regard to the histograms in
Figure 21 and Table 198. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: uterine malignancies.



Table 198 - Normal tissue distribution

Name of Tissue     Number
bladder            164
bone               259
brain              26
colon              66
epithelial         73
general            67
head and neck      0
kidney             49
liver              0
lung               51
lymph nodes        0
breast             87
ovary              0
pancreas           30
skin               295
stomach            0
Thyroid            0
uterus             13

Table 199 - P values and ratios for expression in cancerous tissue

Name of Tissue     P1         P2         SP1        R3     SP2            R4
bladder            7.6e-01    8.0e-01    9.4e-01    0.5    9.9e-01        0.4
bone               7.5e-01    8.8e-01    1          0.1    1              0.3
brain              5.2e-01    6.1e-01    7.0e-04    2.1    1.1e-02        1.4
colon              6.2e-01    6.1e-01    9.7e-01    0.5    9.6e-01        0.6
epithelial         1.0e-01    3.0e-02    7.8e-01    0.7    5.8e-01        0.9
general            3.7e-01    2.6e-01    8.5e-01    0.8    9.0e-01        0.8
head and neck      2.1e-01    1.1e-01    1          1.0    3.2e-01        2.3
kidney             3.8e-01    3.9e-01    6.6e-02    1.8    1.2e-02        2.2
liver              1.8e-01    1.2e-01    2.3e-01    4.3    2.3e-01        2.6
lung               6.2e-01    4.3e-01    8.5e-01    0.7    3.8e-01        1.0
lymph nodes        1          3.1e-01    1          1.0    1              1.3
breast             7.8e-01    5.8e-01    9.1e-01    0.6    [illegible]    [illegible]
ovary              3.8e-01    2.6e-01    3.2e-01    2.4    1.6e-01        2.5
pancreas           5.1e-01    3.3e-01    7.0e-01    0.9    1.0e-01        1.4
skin               6.0e-01    5.2e-01    9.7e-01    0.3    1              0.1
stomach            3.6e-01    3.0e-01    1          1.0    4.1e-01        1.8
Thyroid            5.0e-01    5.0e-01    6.7e-01    1.7    6.7e-01        1.7
uterus             1.1e-01    2.6e-01    2.1e-03    3.2    2.3e-02        2.2

As noted above, cluster F05068 features 3 transcript(s), which were listed in Table 194
above. These transcript(s) encode for protein(s) which are variant(s) of protein ADM precursor.
A description of each variant protein according to the present invention is now provided.

Variant protein F05068_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
F05068_PEA_1_T3 and F05068_PEA_1_T6. An alignment is given to the known protein
(ADM precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between F05068_PEA_1_P7 and ADML_HUMAN:

1. An isolated chimeric polypeptide encoding for F05068_PEA_1_P7, comprising a first
amino acid sequence being at least 90% homologous to
MKLVSVALMYLGSLAFLGADTARLDVASEFRKK corresponding to amino acids 1-33 of
ADML_HUMAN, which also corresponds to amino acids 1-33 of F05068_PEA_1_P7.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein F05068_PEA_1_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 200 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein F05068_PEA_1_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 200 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
4                                         V -> F                       No
10                                        Y -> C                       No

Variant protein F05068 JPEA_1 JP7 is encoded by the following transcript(s): 
F05068_PEAJ_T3 and F05068JPEA_1 JT6, for which the sequence(s) is/are given at the end 
of the application. 

The coding portion of transcript F05068_PEA_1_T3 is shown in bold; this coding portion 
starts at position 267 and ends at position 365. The transcript also has the following SNPs as 
listed in Table 201 (given according to their position on the nucleotide sequence, with the 
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein F05068_PEA_1_P7 sequence provides support for
the deduced sequence of this variant protein according to the present invention). 

Table 201 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
26                                     C -> T                      Yes
164                                    T ->                        No
593                                    G -> C                      Yes
860                                    C ->                        No
860                                    C -> A                      No
1022                                   G -> A                      No
1023                                   G -> A                      No
1023                                   G -> C                      Yes
1084                                   G -> A                      Yes
1088                                   C ->                        No
1088                                   C -> A                      No
1106                                   C ->                        No
177                                    T ->                        No
1106                                   C -> A                      No
1149                                   G ->                        No
1154                                   C ->                        No
1171                                   T -> G                      Yes
1192                                   G ->                        No
1224                                   C ->                        No
1266                                   C ->                        No
1282                                   C -> T                      No
1381                                   G -> A                      No
1450                                   T ->                        No
206                                    C -> T                      Yes
1457                                   T -> G                      No
1534                                   C ->                        No
1535                                   C ->                        No
1554                                   A -> G                      Yes
1572                                   A -> C                      No
1572                                   A -> G                      No
1655                                   A -> C                      Yes
1669                                   T -> C                      Yes
1721                                   C -> T                      No
245                                    G ->                        No
259                                    C ->                        No
276                                    G -> T                      No
295                                    A -> G                      No
317                                    A -> C                      Yes
566                                    C -> G                      Yes
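The relationship between the nucleotide SNP positions and the amino acid positions listed for the variant protein can be illustrated with a minimal sketch. It assumes 1-based, inclusive coordinates and a reading frame that begins at the first nucleotide of the stated coding portion (267 to 365 for F05068_PEA_1_T3); the function name snp_to_residue is hypothetical and is not part of the described method.

```python
from typing import Optional


def snp_to_residue(snp_pos: int, cds_start: int, cds_end: int) -> Optional[int]:
    """Map a 1-based nucleotide position to the 1-based codon (residue) index.

    Returns None when the position lies outside the coding portion.
    Assumes the reading frame starts exactly at cds_start.
    """
    if not cds_start <= snp_pos <= cds_end:
        return None
    return (snp_pos - cds_start) // 3 + 1


# Coding portion of transcript F05068_PEA_1_T3 as stated in the text.
CDS_START, CDS_END = 267, 365

if __name__ == "__main__":
    for pos in (26, 276, 295, 566):
        print(pos, snp_to_residue(pos, CDS_START, CDS_END))
```

Under these assumptions, nucleotide positions 276 and 295 fall in codons 4 and 10, which are the two amino acid positions listed in Table 200, while positions 26 and 566 lie outside the coding portion.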



The coding portion of transcript F05068_PEA_1_T6 is shown in bold; this coding portion
starts at position 267 and ends at position 365. The transcript also has the following SNPs as
listed in Table 202 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein F05068_PEA_1_P7 sequence provides support for
the deduced sequence of this variant protein according to the present invention). 

Table 202 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
26                                     C -> T                      Yes
164                                    T ->                        No
593                                    G -> C                      Yes
739                                    C -> G                      Yes
1093                                   C ->                        No
1093                                   C -> A                      No
1255                                   G -> A                      No
1256                                   G -> A                      No
1256                                   G -> C                      Yes
1317                                   G -> A                      Yes
1321                                   C ->                        No
1321                                   C -> A                      No
177                                    T ->                        No
1339                                   C ->                        No
1339                                   C -> A                      No
1382                                   G ->                        No
1387                                   C ->                        No
1404                                   T -> G                      Yes
1425                                   G ->                        No
1457                                   C ->                        No
1499                                   C ->                        No
1515                                   C -> T                      No
1614                                   G -> A                      No
206                                    C -> T                      Yes
1683                                   T ->                        No
1690                                   T -> G                      No
1767                                   C ->                        No
1768                                   C ->                        No
1787                                   A -> G                      Yes
1805                                   A -> C                      No
1805                                   A -> G                      No
1888                                   A -> C                      Yes
1902                                   T -> C                      Yes
1954                                   C -> T                      No
245                                    G ->                        No
259                                    C ->                        No
276                                    G -> T                      No
295                                    A -> G                      No
317                                    A -> C                      Yes
566                                    C -> G                      Yes



Variant protein F05068_PEA_1_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
F05068_PEA_1_T4. An alignment is given to the known protein (ADM precursor) at the end of
the application. One or more alignments to one or more previously published protein sequences
are given at the end of the application. A brief description of the relationship of the variant
protein according to the present invention to each such aligned protein is as follows:

Comparison report between F05068_PEA_1_P8 and ADML_HUMAN:
1. An isolated chimeric polypeptide encoding for F05068_PEA_1_P8, comprising a first
amino acid sequence being at least 90 % homologous to
MKLVSVALMYLGSLAFLGADTARLDVASEFRKKWNKWALSRGKRELRMSSSYPTGLADVKAGPAQTLIRPQDMKGASRSPED
corresponding to amino acids 1 - 82 of
ADML_HUMAN, which also corresponds to amino acids 1 - 82 of F05068_PEA_1_P8, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence R corresponding to amino acids 83 - 83 of F05068_PEA_1_P8, wherein
said first and second amino acid sequences are contiguous and in a sequential order.

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein F05068_PEA_1_P8 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 203, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein F05068_PEA_1_P8
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 

Table 203 - Amino acid mutations 



SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
4                                         V -> F                       No
50                                        S -> R                       Yes
10                                        Y -> C                       No



Variant protein F05068_PEA_1_P8 is encoded by the following transcript(s):
F05068_PEA_1_T4, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript F05068_PEA_1_T4 is shown in bold; this coding portion starts at
position 267 and ends at position 515. The transcript also has the following SNPs as listed in
Table 204 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein F05068_PEA_1_P8 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 204 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
26                                     C -> T                      Yes
164                                    T ->                        No
443                                    G -> C                      Yes
589                                    C -> G                      Yes
943                                    C ->                        No
943                                    C -> A                      No
1105                                   G -> A                      No
1106                                   G -> A                      No
1106                                   G -> C                      Yes
1167                                   G -> A                      Yes
1171                                   C ->                        No
1171                                   C -> A                      No
177                                    T ->                        No
1189                                   C ->                        No
1189                                   C -> A                      No
1232                                   G ->                        No
1237                                   C ->                        No
1254                                   T -> G                      Yes
1275                                   G ->                        No
1307                                   C ->                        No
1349                                   C ->                        No
1365                                   C -> T                      No
1464                                   G -> A                      No
206                                    C -> T                      Yes
1533                                   T ->                        No
1540                                   T -> G                      No
1617                                   C ->                        No
1618                                   C ->                        No
1637                                   A -> G                      Yes
1655                                   A -> C                      No
1655                                   A -> G                      No
1738                                   A -> C                      Yes
1752                                   T -> C                      Yes
1804                                   C -> T                      No
245                                    G ->                        No
259                                    C ->                        No
276                                    G -> T                      No
295                                    A -> G                      No
317                                    A -> C                      Yes
416                                    C -> G                      Yes


As noted above, cluster F05068 features 12 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster F05068_PEA_1_node_0 according to the present invention is supported
by 143 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 205 below describes the starting and ending position of this segment
on each transcript. 



Table 205 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    1                            245
F05068_PEA_1_T4    1                            245
F05068_PEA_1_T6    1                            245



Segment cluster F05068_PEA_1_node_10 according to the present invention is supported
by 127 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 206 below describes the starting and ending position of this segment
on each transcript. 

Table 206 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    749                          909
F05068_PEA_1_T4    832                          992
F05068_PEA_1_T6    982                          1142









Segment cluster F05068_PEA_1_node_12 according to the present invention is supported
by 123 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 207 below describes the starting and ending position of this segment
on each transcript. 



Table 207 - Segment location on transcripts 





Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    986                          1106
F05068_PEA_1_T4    1069                         1189
F05068_PEA_1_T6    1219                         1339

Segment cluster F05068_PEA_1_node_13 according to the present invention is supported
by 181 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 208 below describes the starting and ending position of this segment
on each transcript.



Table 208 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    1107                         1737
F05068_PEA_1_T4    1190                         1820
F05068_PEA_1_T6    1340                         1970



Segment cluster F05068_PEA_1_node_4 according to the present invention is supported
by 15 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3 and F05068_PEA_1_T6. Table
209 below describes the starting and ending position of this segment on each transcript.

Table 209 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    365                          514
F05068_PEA_1_T6    365                          514



Segment cluster F05068_PEA_1_node_8 according to the present invention is supported
by 13 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T4 and F05068_PEA_1_T6. Table
210 below describes the starting and ending position of this segment on each transcript.

Table 210 - Segment location on transcripts 





Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T4    515                          747
F05068_PEA_1_T6    665                          897

According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 
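The distinction between the longer segments and the short segments can be sketched from the start and end positions given in the segment tables. The sketch below assumes 1-based, inclusive coordinates and treats 120 bp as a hard cutoff, although the text only says "up to about 120 bp"; the Segment type and the is_short helper are hypothetical names used for illustration.

```python
from typing import NamedTuple

SHORT_SEGMENT_MAX_BP = 120  # assumed cutoff, taken from "up to about 120 bp"


class Segment(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        """Segment length in base pairs for inclusive coordinates."""
        return self.end - self.start + 1


def is_short(segment: Segment) -> bool:
    """True when the segment is no longer than the assumed cutoff."""
    return segment.length <= SHORT_SEGMENT_MAX_BP


# Coordinates on transcript F05068_PEA_1_T3, from Tables 207 and 212.
node_12 = Segment("F05068_PEA_1_node_12", 986, 1106)
node_3 = Segment("F05068_PEA_1_node_3", 246, 364)

if __name__ == "__main__":
    for seg in (node_12, node_3):
        print(seg.name, seg.length, "short" if is_short(seg) else "long")
```

With these coordinates node_12 spans 121 bp and node_3 spans 119 bp, which agrees with node_12 being described among the longer segments and node_3 among the short ones.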

Segment cluster F05068_PEA_1_node_11 according to the present invention is supported
by 112 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 211 below describes the starting and ending position of this segment
on each transcript.

Table 211 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    910                          985
F05068_PEA_1_T4    993                          1068
F05068_PEA_1_T6    1143                         1218









Segment cluster F05068_PEA_1_node_3 according to the present invention is supported
by 145 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 212 below describes the starting and ending position of this segment
on each transcript. 



Table 212 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    246                          364
F05068_PEA_1_T4    246                          364
F05068_PEA_1_T6    246                          364



Segment cluster F05068_PEA_1_node_5 according to the present invention is supported
by 124 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 213 below describes the starting and ending position of this segment
on each transcript.



Table 213 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    515                          573
F05068_PEA_1_T4    365                          423
F05068_PEA_1_T6    515                          573



Segment cluster F05068_PEA_1_node_6 according to the present invention is supported
by 110 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 214 below describes the starting and ending position of this segment
on each transcript.

Table 214 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    574                          613
F05068_PEA_1_T4    424                          463
F05068_PEA_1_T6    574                          613



Segment cluster F05068_PEA_1_node_7 according to the present invention is supported
by 109 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 215 below describes the starting and ending position of this segment
on each transcript.



Table 215 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    614                          664
F05068_PEA_1_T4    464                          514
F05068_PEA_1_T6    614                          664



Segment cluster F05068_PEA_1_node_9 according to the present invention is supported
by 114 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): F05068_PEA_1_T3, F05068_PEA_1_T4 and
F05068_PEA_1_T6. Table 216 below describes the starting and ending position of this segment
on each transcript. 



Table 216 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
F05068_PEA_1_T3    665                          748
F05068_PEA_1_T4    748                          831
F05068_PEA_1_T6    898                          981
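Taken together, the segment location tables for cluster F05068 (Tables 205 to 216) list, for each transcript, a set of start and end positions. A minimal sketch of a consistency check is given below: it tests whether the segments listed for a transcript cover it from position 1 onward without gaps or overlaps. It assumes 1-based, inclusive coordinates; the function name tiles_contiguously is hypothetical and the check is an illustration, not a step described in the application.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive


def tiles_contiguously(segments: List[Interval]) -> bool:
    """True when the sorted segments start at 1 and each begins right after the previous one ends."""
    ordered = sorted(segments)
    if not ordered or ordered[0][0] != 1:
        return False
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


# Segment coordinates on F05068_PEA_1_T3, copied from Tables 205 to 216.
T3_SEGMENTS = [
    (1, 245), (246, 364), (365, 514), (515, 573), (574, 613), (614, 664),
    (665, 748), (749, 909), (910, 985), (986, 1106), (1107, 1737),
]

# Segment coordinates on F05068_PEA_1_T4, copied from the same tables.
T4_SEGMENTS = [
    (1, 245), (246, 364), (365, 423), (424, 463), (464, 514), (515, 747),
    (748, 831), (832, 992), (993, 1068), (1069, 1189), (1190, 1820),
]

if __name__ == "__main__":
    print("T3 contiguous:", tiles_contiguously(T3_SEGMENTS))
    print("T4 contiguous:", tiles_contiguously(T4_SEGMENTS))
```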



Variant protein alignment to the previously known protein: 

Sequence name: /tmp/kEsi3RWsCN/IsvdhjfiNV:ADML_HUMAN

Sequence documentation:

Alignment of: F05068_PEA_1_P7 x ADML_HUMAN

Alignment segment 1/1:

Quality: 304.00
Escore: 0
Matching length: 33
Total length: 33
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

1 MKLVSVALMYLGSLAFLGADTARLDVASEFRKK 33
  |||||||||||||||||||||||||||||||||
1 MKLVSVALMYLGSLAFLGADTARLDVASEFRKK 33



Sequence name: /tmp/tcrlWIx4kg/aghbr8Eh8n:ADML_HUMAN

Sequence documentation:

Alignment of: F05068_PEA_1_P8 x ADML_HUMAN

Alignment segment 1/1:

Quality: 791.00
Escore: 0
Matching length: 82
Total length: 82
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 1 MKLVSVALMYLGSLAFLGADTARLDVASEFRKKWNKWALSRGKRELRMSS 50
   ||||||||||||||||||||||||||||||||||||||||||||||||||
 1 MKLVSVALMYLGSLAFLGADTARLDVASEFRKKWNKWALSRGKRELRMSS 50

51 SYPTGLADVKAGPAQTLIRPQDMKGASRSPED 82
   ||||||||||||||||||||||||||||||||
51 SYPTGLADVKAGPAQTLIRPQDMKGASRSPED 82
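The "Percent Identity" figures in the alignment reports above, and the "at least 90 % homologous" threshold in the comparison reports, can be illustrated with a simple position-by-position comparison. This is only a sketch under stated assumptions: it handles ungapped sequences of equal length and counts exact matches, whereas the alignment program used for the reports is not specified here and may score homology differently. The function name percent_identity is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical residues in two ungapped, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Residues 1-33 shared by F05068_PEA_1_P7 and ADML_HUMAN, as given in the text.
SHARED_1_33 = "MKLVSVALMYLGSLAFLGADTARLDVASEFRKK"

# Same segment carrying the V -> F substitution at position 4 (Table 200).
WITH_V4F = SHARED_1_33[:3] + "F" + SHARED_1_33[4:]

if __name__ == "__main__":
    print(percent_identity(SHARED_1_33, SHARED_1_33))
    print(percent_identity(SHARED_1_33, WITH_V4F) >= 90.0)
```

Under this simplified measure, a single substitution in the 33-residue segment leaves 32 of 33 positions identical, which remains above the 90 % threshold.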



DESCRIPTION FOR CLUSTER H14624 
Cluster H14624 features 1 transcript(s) and 15 segment(s) of interest, the names for which
are given in Tables 217 and 218, respectively, the sequences themselves being given at the end of
the application. The selected protein variants are given in Table 219.

Table 217 - Transcripts of interest

Transcript Name    Sequence ID No.
H14624_T20         28


Table 218 - Segments of interest

Segment Name      Sequence ID No.
H14624_node_0     362
H14624_node_16    363
H14624_node_3     364
H14624_node_10    365
H14624_node_11    366
H14624_node_12    367
H14624_node_13    368
H14624_node_14    370
H14624_node_15    371
H14624_node_4     372
H14624_node_5     373
H14624_node_6     374
H14624_node_7     375
H14624_node_8     376
H14624_node_9     377


Table 219 - Proteins of interest 


Protein Name    Sequence ID No.
H14624_P15      1306



Cluster H14624 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 22 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
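The "parts per million" ratio described above can be sketched as follows. The sketch assumes plain, unweighted EST counts (the weighting scheme mentioned in the text is defined elsewhere in the application and is not reproduced here), and the counts used in the example are hypothetical values chosen only to show the arithmetic.

```python
def parts_per_million(cluster_est_count: int, total_est_count: int) -> float:
    """Ratio of ESTs from one cluster to all ESTs in a category, scaled to parts per million."""
    if total_est_count <= 0:
        raise ValueError("total_est_count must be positive")
    # Multiply before dividing so that simple integer inputs give exact results.
    return cluster_est_count * 1_000_000 / total_est_count


if __name__ == "__main__":
    # Hypothetical counts: 3 ESTs from the cluster among 100,000 ESTs in the category.
    print(parts_per_million(3, 100_000))
```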

Overall, the following results were obtained as shown with regard to the histograms in 
Figure 22 and Table 220. This cluster is overexpressed (at least at a minimum level) in the 
following pathological conditions: colorectal cancer, epithelial malignant tumors, a mixture of
malignant tumors from different tissues, lung malignant tumors and pancreas carcinoma.



Table 220 - Normal tissue distribution 



Name of Tissue    Number
adrenal           0
bladder           410
bone              71
brain             42
colon             6
epithelial        91
general           74
head and neck     0
kidney            0
lung              30
breast            949
ovary             7
pancreas          2
prostate          94
stomach           3
Thyroid           128
uterus            54



Table 221 - P values and ratios for expression in cancerous tissue 



Name of Tissue    P1         P2         SP1        R3      SP2        R4
adrenal           4.2e-01    4.6e-01    4.6e-01    2.2     5.3e-01    1.9
bladder           5.4e-01    6.0e-01    1.2e-02    1.6     2.2e-01    1.0
bone              4.9e-01    8.5e-01    1.8e-01    1.3     7.5e-01    0.6
brain             4.7e-01    7.0e-01    6.3e-05    2.3     9.4e-03    1.4
colon             4.4e-02    9.9e-02    4.5e-03    5.4     2.0e-02    3.9
epithelial        7.7e-03    3.6e-01    1.5e-11    2.0     2.9e-02    1.1
general           5.1e-03    5.9e-01    8.3e-21    2.2     1.5e-04    1.2
head and neck     1.4e-01    2.8e-01    4.6e-01    2.2     7.5e-01    1.3
kidney            6.5e-01    7.2e-01    5.8e-01    1.7     7.0e-01    1.4
lung              6.1e-02    1.4e-01    3.3e-05    5.8     8.1e-03    2.9
breast            2.4e-01    4.1e-01    1          0.3     1          0.2
ovary             8.5e-01    7.3e-01    6.8e-01    1.2     1.6e-01    1.6
pancreas          7.5e-03    4.9e-02    1.2e-21    22.4    2.4e-16    15.1
prostate          8.3e-01    8.9e-01    7.2e-01    0.8     8.8e-01    0.6
stomach           4.6e-01    8.5e-01    1.0e-03    2.7     1.1e-01    1.4
Thyroid           7.0e-01    7.0e-01    5.9e-01    1.0     5.9e-01    1.0
uterus            4.1e-01    7.3e-01    2.3e-01    1.2     6.2e-01    0.7



above. A description of each variant protein according to the present invention is now provided. 



Variant protein H14624_P15 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) H14624_T20. One
or more alignments to one or more previously published protein sequences are given at the end 
of the application. A brief description of the relationship of the variant protein according to the 
present invention to each such aligned protein is as follows: 

Comparison report between H14624_P15 and Q9HAP5 (SEQ ID NO: 1701):
1. An isolated chimeric polypeptide encoding for H14624_P15, comprising a first amino
acid sequence being at least 90 % homologous to
MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLCHGIEYQNMR
LPNLLGHETMKEVLEQAGAW

VQVKDRCAPVMSAFGFPWPDMLECDRFPQDNDLCIPLASSDHLLPATEE corresponding
to amino acids 1 - 167 of Q9HAP5, which also corresponds to amino acids 1 - 167 of
H14624_P15, and a second amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence GKPSLLLPHSLLG corresponding to amino
acids 168 - 180 of H14624_P15, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of H14624_P15, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
GKPSLLLPHSLLG in H14624_P15.


The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein H14624_P15 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 222, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein H14624_P15 sequence provides
support for the deduced sequence of this variant protein according to the present invention). 

Table 222 - Amino acid mutations 



SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
11                                        L ->                         No
170                                       P -> S                       Yes
28                                        F ->                         No
29                                        G ->                         No
38                                        S ->                         No
45                                        A -> V                       Yes
60                                        L ->                         No



Variant protein H14624_P15 is encoded by the following transcript(s): H14624_T20, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
H14624_T20 is shown in bold; this coding portion starts at position 857 and ends at position
1396. The transcript also has the following SNPs as listed in Table 223 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
H14624_P15 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 223 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
389                                    A -> G                      No
476                                    C -> T                      No
969                                    G ->                        No
988                                    G -> T                      Yes
990                                    C -> T                      Yes
1034                                   C ->                        No
1168                                   C -> T                      Yes
1364                                   C -> T                      Yes
488                                    T -> C                      No
819                                    C -> G                      Yes
851                                    C ->                        No
887                                    C ->                        No
922                                    G -> A                      Yes
934                                    C -> T                      Yes
938                                    T ->                        No
943                                    C ->                        No



As noted above, cluster H14624 features 15 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster H14624_node_0 according to the present invention is supported by 3 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): H14624_T20. Table 224 below describes the starting and 
ending position of this segment on each transcript.

Table 224 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1                            573

Segment cluster H14624_node_16 according to the present invention is supported by 3
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 225 below describes the starting and
ending position of this segment on each transcript.

Table 225 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H14624_T20         1359                         1745



Segment cluster H14624_node_3 according to the present invention is supported by 67 
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 226 below describes the starting and
ending position of this segment on each transcript. 
Table 226 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         574                          822



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description. 



Segment cluster H14624_node_10 according to the present invention can be found in the 
following transcript(s): H14624_T20. Table 227 below describes the starting and ending 
position of this segment on each transcript.

Table 227 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1070                         1079

Segment cluster H14624_node_11 according to the present invention is supported by 99
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 228 below describes the starting and
ending position of this segment on each transcript.

Table 228 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H14624_T20         1080                         1114



Segment cluster H14624_node_12 according to the present invention can be found in the 
following transcript(s): H14624_T20. Table 229 below describes the starting and ending
position of this segment on each transcript.



Table 229 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1115                         1135



Segment cluster H14624_node_13 according to the present invention is supported by 124 
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 230 below describes the starting and 
ending position of this segment on each transcript. 



Table 230 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1136                         1227



Segment cluster H14624_node_14 according to the present invention is supported by 114
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 231 below describes the starting and
ending position of this segment on each transcript. 

Table 231 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1228                         1287



Segment cluster H14624_node_15 according to the present invention is supported by 124 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): H14624_T20. Table 232 below describes the starting and
ending position of this segment on each transcript. 



Table 232 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         1288                         1358



Segment cluster H14624_node_4 according to the present invention is supported by 65 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): H14624_T20. Table 233 below describes the starting and 
ending position of this segment on each transcript. 

Table 233 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         823                          892



Segment cluster H14624_node_5 according to the present invention can be found in the 
following transcript(s): H14624_T20. Table 234 below describes the starting and ending 
position of this segment on each transcript. 

Table 234 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H14624_T20         893                          903



Segment cluster H14624_node_6 according to the present invention can be found in the 
following transcript(s): H14624_T20. Table 235 below describes the starting and ending 
position of this segment on each transcript.



Table 235 - Segment location on transcripts 



Transcript name    Segment starting position    Segment ending position
H14624_T20         904                          927



Segment cluster H14624_node_7 according to the present invention can be found in the 
following transcript(s): H14624_T20. Table 236 below describes the starting and ending
position of this segment on each transcript. 



Table 236 - Segment location on transcripts 



Transcnpt hame * " 


Segment starting position 


Segment ending position 


H14624JT20 


928 


934 



Segment cluster H14624_node_8 according to the present invention is supported by 85
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 237 below describes the starting and
ending position of this segment on each transcript.

Table 237 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H14624_T20         935                          1014

Segment cluster H14624_node_9 according to the present invention is supported by 87
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): H14624_T20. Table 238 below describes the starting and
ending position of this segment on each transcript.

Table 238 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
H14624_T20         1015                         1069



Variant protein alignment to the previously known protein:

Sequence name: /tmp/UpblSbFkr j /N4PrGQAB2V : Q9HAP5

Sequence documentation:

Alignment of: H14624_P15 x Q9HAP5

Alignment segment 1/1:

Quality: 1702.00    Escore: 0
Matching length: 167    Total length: 167
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLC 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLC 50

 51 HGIEYQNMRLPNLLGHETMKEVLEQAGAWIPLVMKQCHPDTKKFLCSLFA 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 HGIEYQNMRLPNLLGHETMKEVLEQAGAWIPLVMKQCHPDTKKFLCSLFA 100

101 PVCLDDLDETIQPCHSLCVQVKDRCAPVMSAFGFPWPDMLECDRFPQDND 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PVCLDDLDETIQPCHSLCVQVKDRCAPVMSAFGFPWPDMLECDRFPQDND 150

151 LCIPLASSDHLLPATEE 167
    |||||||||||||||||
151 LCIPLASSDHLLPATEE 167
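
The identity percentage reported for the alignment above can be recomputed from the aligned rows themselves. The following is a minimal Python sketch, assuming an ungapped alignment in which both rows have the same length; the function name `percent_identity` is hypothetical and is not part of the alignment software that produced the report.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned positions that carry the same residue.

    Assumes an ungapped alignment, so both rows must have equal length.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b)
    return 100.0 * matches / len(row_a)


# Last row of the alignment above (residues 151-167 of both sequences).
variant_row = "LCIPLASSDHLLPATEE"
known_row = "LCIPLASSDHLLPATEE"
identity = percent_identity(variant_row, known_row)
```

Applied to every row of an alignment and weighted by row length, the same calculation gives the overall identity figure.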



DESCRIPTION FOR CLUSTER H38804

Cluster H38804 features 2 transcript(s) and 20 segment(s) of interest, the names for which
are given in Tables 239 and 240, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 241.

Table 239 - Transcripts of interest

Transcript Name      Sequence ID No.
H38804_PEA_1_T24     29
H38804_PEA_1_T8      30


Table 240 - Segments of interest

Segment Name            Sequence ID No.
H38804_PEA_1_node_0     378
H38804_PEA_1_node_1     379
H38804_PEA_1_node_16    380
H38804_PEA_1_node_19    381
H38804_PEA_1_node_24    382
H38804_PEA_1_node_25    383
H38804_PEA_1_node_28    384
H38804_PEA_1_node_29    385
H38804_PEA_1_node_30    386
H38804_PEA_1_node_10    387
H38804_PEA_1_node_12    388
H38804_PEA_1_node_13    389
H38804_PEA_1_node_14    390
H38804_PEA_1_node_2     391
H38804_PEA_1_node_20    392
H38804_PEA_1_node_23    393
H38804_PEA_1_node_26    394
H38804_PEA_1_node_3     395
H38804_PEA_1_node_4     396
H38804_PEA_1_node_5     397


Table 241 - Proteins of interest

Protein Name         Sequence ID No.
H38804_PEA_1_P5      1307
H38804_PEA_1_P17     1308



These sequences are variants of the known protein Mitotic checkpoint protein BUB3
(SwissProt accession identifier BUB3_HUMAN), SEQ ID NO: 1424, referred to herein as the
previously known protein.

Protein Mitotic checkpoint protein BUB3 is known or believed to have the following
function(s): Required for kinetochore localization of BUB1. The sequence for protein Mitotic
checkpoint protein BUB3 is given at the end of the application, as "Mitotic checkpoint protein
BUB3 amino acid sequence". Known polymorphisms for this sequence are as shown in Table
242.

Table 242 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
326 - 327                                 Missing



Protein Mitotic checkpoint protein BUB3 localization is believed to be Nuclear. 

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: mitosis; mitotic checkpoint; mitotic spindle checkpoint; cell
proliferation, which are annotation(s) related to Biological Process; and nucleus, which are
annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster H38804 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 23 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 23 and Table 243. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: transitional cell carcinoma, brain malignant tumors, a
mixture of malignant tumors from different tissues and gastric carcinoma.
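
The "parts per million" figure defined above is a simple ratio scaled by one million. The following Python sketch illustrates the arithmetic only; it uses plain, unweighted EST counts because the weighting scheme is described elsewhere in the application, and both the function name and the example counts are hypothetical.

```python
def parts_per_million(cluster_ests: float, all_ests: float) -> float:
    """Expression of one cluster relative to all ESTs in a category, in ppm."""
    if all_ests <= 0:
        raise ValueError("the category must contain at least one EST")
    return 1_000_000 * cluster_ests / all_ests


# Hypothetical counts, for illustration only: 5 ESTs assigned to the cluster
# among 40,000 ESTs sequenced from one tissue category.
example_ppm = parts_per_million(5, 40_000)
```

Expressing the ratio per million ESTs allows tissue categories with very different library sizes to be compared on a common scale.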

Table 243 - Normal tissue distribution

Name of Tissue    Number
adrenal           124
bladder           0
bone              64
brain             40
colon             75
epithelial        86
general           79
head and neck     334
kidney            69
liver             14
lung              125
lymph nodes       218
breast            263
bone marrow       62
muscle            27
ovary             109
pancreas          43
prostate          32
skin              53
stomach           0
T cells           557
Thyroid           257
uterus            113



Table 244 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
adrenal           6.3e-01    5.4e-01    1.8e-01    1.4    5.0e-02    1.9
bladder           7.0e-02    2.6e-02    3.2e-02    4.9    9.9e-03    6.2
bone              3.7e-01    2.3e-01    7.9e-01    0.9    3.2e-01    1.6
brain             3.1e-02    4.2e-03    5.3e-01    1.2    1.1e-02    2.1
colon             2.4e-01    1.1e-01    2.0e-01    1.7    1.6e-01    1.8
epithelial        1.1e-01    2.2e-02    1.5e-01    1.2    8.6e-03    1.3
general           2.3e-02    2.3e-04    9.0e-02    1.2    4.7e-05    1.4
head and neck     4.4e-01    4.7e-01    9.2e-01    0.6    8.9e-01    0.5
kidney            8.2e-01    8.4e-01    9.0e-01    0.8    3.5e-01    1.0
liver             8.3e-01    1.5e-01    1          0.8    5.3e-02    2.8
lung              6.9e-01    8.1e-01    5.1e-01    1.1    6.0e-01    0.8
lymph nodes       5.1e-01    6.9e-01    5.0e-01    0.9    9.5e-01    0.5
breast            4.9e-01    4.2e-01    9.7e-01    0.5    9.5e-01    0.5
bone marrow       6.7e-01    5.4e-01    1          1.5    3.3e-02    2.6
muscle            8.5e-01    6.1e-01    1          0.4    6.3e-01    1.0
ovary             3.4e-01    3.3e-01    2.5e-01    1.5    4.7e-01    1.1
pancreas          4.3e-01    4.9e-01    6.3e-01    1.0    6.9e-01    0.9
prostate          7.4e-01    6.5e-01    1.5e-01    1.9    1.0e-01    2.0
skin              6.0e-01    1.7e-01    5.4e-01    1.4    2.7e-02    1.2
stomach           4.5e-02    9.9e-03    2.5e-01    3.1    4.3e-02    4.3
T cells           5.0e-01    6.7e-01    1          0.3    9.8e-01    0.5
Thyroid           5.7e-01    5.7e-01    1          0.4    1          0.4
uterus            5.7e-01    6.7e-01    9.2e-01    0.6    8.7e-01    0.5



As noted above, cluster H38804 features 2 transcript(s), which were listed in Table 239
above. These transcript(s) encode for protein(s) which are variant(s) of protein Mitotic
checkpoint protein BUB3. A description of each variant protein according to the present
invention is now provided.

Variant protein H38804_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
H38804_PEA_1_T8. An alignment is given to the known protein (Mitotic checkpoint protein
BUB3) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between H38804_PEA_1_P5 and BUB3_HUMAN:

1. An isolated chimeric polypeptide encoding for H38804_PEA_1_P5, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence

MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
corresponding to amino acids 1 - 57 of H38804_PEA_1_P5, and a second amino acid sequence
being at least 90% homologous to

MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGA
VLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIRCVEYCPEVNVMVTG
SWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQ
QRRESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENN
IEQIYPVNAISFHNIHNTFATGGSDGFVNIWDPFNKKRLCQFH
AIASSYMYEMDDTEHPEDGIFIRQVTDAETKPK corresponding to amino acids 1 - 324 of
BUB3_HUMAN, which also corresponds to amino acids 58 - 381 of H38804_PEA_1_P5,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of H38804_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
of H38804_PEA_1_P5.
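
Because the variant consists of a unique 57-residue head followed directly by the known protein sequence, positions on the two proteins differ by a constant offset. The Python sketch below illustrates that coordinate relationship for H38804_PEA_1_P5; the function names are hypothetical, and positions are assumed to be 1-based as in the comparison report.

```python
# Head sequence of the variant, as recited in the comparison report above.
HEAD = "MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ"


def known_to_variant(known_pos: int, head_len: int = len(HEAD)) -> int:
    """Map a 1-based position on the known protein to the variant protein."""
    if known_pos < 1:
        raise ValueError("positions are 1-based")
    return known_pos + head_len


def variant_to_known(variant_pos: int, head_len: int = len(HEAD)):
    """Map a 1-based variant position back to the known protein.

    Returns None when the position lies inside the unique head.
    """
    if variant_pos < 1:
        raise ValueError("positions are 1-based")
    if variant_pos <= head_len:
        return None
    return variant_pos - head_len


first_shared = known_to_variant(1)    # start of the shared region on the variant
last_shared = known_to_variant(324)   # end of the shared region on the variant
```

With a head of 57 residues, amino acids 1 - 324 of the known protein land on amino acids 58 - 381 of the variant, which agrees with the ranges stated in the report.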

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because one of the two
signal-peptide prediction programs (HMM: Signal peptide, NN: NO) predicts that this protein has a
signal peptide.

Variant protein H38804_PEA_1_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 245 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein H38804_PEA_1_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 245 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
126                                       H -> Y                       No
129                                       S -> R                       Yes
256                                       I ->                         No
256                                       I -> N                       No
258                                       G ->                         No
266                                       D ->                         No
266                                       D -> E                       No
266                                       D -> N                       Yes
296                                       A -> G                       No
296                                       A -> V                       No
306                                       F -> C                       No
314                                       F ->                         No
215                                       R -> K                       No
361                                       T -> A                       No
381                                       K ->                         No
217                                       L ->                         No
220                                       D ->                         No
220                                       D -> E                       No
245                                       F ->                         No
245                                       F -> V                       No
248                                       K ->                         No
248                                       K -> Q                       No



Variant protein H38804_PEA_1_P5 is encoded by the following transcript(s):
H38804_PEA_1_T8, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript H38804_PEA_1_T8 is shown in bold; this coding portion starts at
position 475 and ends at position 1617. The transcript also has the following SNPs as listed in
Table 246 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein H38804_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 246 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
161                                    C ->                        No
167                                    C ->                        No
1118                                   G -> A                      No
1123                                   T ->                        No
1134                                   C ->                        No
1134                                   C -> A                      No
1207                                   T ->                        No
1207                                   T -> G                      No
1216                                   A ->                        No
1216                                   A -> C                      No
1241                                   T ->                        No
1241                                   T -> A                      No
167                                    C -> A                      No
1248                                   C ->                        No
1248                                   C -> G                      No
1270                                   G -> A                      Yes
1272                                   C ->                        No
1272                                   C -> A                      No
1361                                   C -> G                      No
1361                                   C -> T                      No
1391                                   T -> G                      No
1414                                   T ->                        No
1419                                   A -> G                      No
192                                    T ->                        No
1555                                   A -> G                      No
1615                                   A ->                        No
1642                                   G -> A                      Yes
1846                                   T -> C                      Yes
2090                                   A -> G                      No
2356                                   C -> G                      No
2712                                   G ->                        No
2909                                   T -> C                      No
2909                                   T -> G                      No
3020                                   T -> G                      No
208                                    C -> T                      Yes
3251                                   T ->                        No
3306                                   T ->                        No
3307                                   T -> G                      No
3354                                   T ->                        No
3521                                   -> G                        No
3601                                   C ->                        No
3601                                   C -> G                      No
3633                                   T ->                        No
3633                                   T -> G                      No
3638                                   A ->                        No
849                                    G -> T                      No
3638                                   A -> C                      No
3674                                   C -> T                      Yes
3812                                   T -> G                      No
3862                                   G -> A                      Yes
3864                                   T -> A                      No
3865                                   T -> A                      No
3990                                   T -> G                      No
4096                                   T -> G                      No
4152                                   G -> A                      Yes
850                                    C -> T                      No
855                                    C -> T                      Yes
861                                    T -> G                      Yes
1098                                   T -> C                      No
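
The nucleotide positions in Table 246 and the amino acid positions in Table 245 are related through the stated coding portion of transcript H38804_PEA_1_T8 (positions 475 to 1617). The Python sketch below shows one way to convert between them; it assumes that the reading frame begins at the first coding nucleotide, and the function name `codon_index` is hypothetical.

```python
CDS_START = 475   # first coding nucleotide of H38804_PEA_1_T8, as stated above
CDS_END = 1617    # last coding nucleotide of H38804_PEA_1_T8, as stated above


def codon_index(nt_pos: int, cds_start: int = CDS_START, cds_end: int = CDS_END):
    """Map a 1-based transcript position to its place in the coding portion.

    Returns (amino_acid_position, position_within_codon), both 1-based,
    or None when the nucleotide lies outside the coding portion.
    Assumes the reading frame starts at cds_start.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


snp_850 = codon_index(850)     # Table 246 lists C -> T at nucleotide 850
snp_861 = codon_index(861)     # Table 246 lists T -> G at nucleotide 861
snp_1270 = codon_index(1270)   # Table 246 lists G -> A at nucleotide 1270
snp_161 = codon_index(161)     # upstream of the coding portion
```

Under this assumption, nucleotides 850, 861 and 1270 fall in codons 126, 129 and 266, which are the positions at which Table 245 lists H -> Y, S -> R and D -> N. SNPs outside positions 475 to 1617, such as the one at nucleotide 161, have no counterpart in Table 245.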



Variant protein H38804_PEA_1_P17 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
H38804_PEA_1_T24. An alignment is given to the known protein (Mitotic checkpoint protein
BUB3) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between H38804_PEA_1_P17 and BUB3_HUMAN:

1. An isolated chimeric polypeptide encoding for H38804_PEA_1_P17, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence

MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
corresponding to amino acids 1 - 57 of H38804_PEA_1_P17, and a second amino acid sequence
being at least 90% homologous to

MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGA
VLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIRCVEYCPEVNVMVTG
SWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQ
QRRESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENN
IEQIYPVNAISFHNIHNTFA
AIASSYMYEMDDTEHPEDGIFIRQVTDAETKPKSPCT corresponding to amino acids 1 -
328 of BUB3_HUMAN, which also corresponds to amino acids 58 - 385 of
H38804_PEA_1_P17, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a head of H38804_PEA_1_P17, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQ
of H38804_PEA_1_P17.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because one of the two
signal-peptide prediction programs (HMM: Signal peptide, NN: NO) predicts that this protein has a
signal peptide.

Variant protein H38804_PEA_1_P17 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 247 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein H38804_PEA_1_P17
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 247 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
126                                       H -> Y                       No
129                                       S -> R                       Yes
256                                       I ->                         No
256                                       I -> N                       No
258                                       G ->                         No
266                                       D ->                         No
266                                       D -> E                       No
266                                       D -> N                       Yes
296                                       A -> G                       No
296                                       A -> V                       No
306                                       F -> C                       No
314                                       F ->                         No
215                                       R -> K                       No
361                                       T -> A                       No
381                                       K ->                         No
217                                       L ->                         No
220                                       D ->                         No
220                                       D -> E                       No
245                                       F ->                         No
245                                       F -> V                       No
248                                       K ->                         No
248                                       K -> Q                       No



Variant protein H38804_PEA_1_P17 is encoded by the following transcript(s):
H38804_PEA_1_T24, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript H38804_PEA_1_T24 is shown in bold; this coding portion starts at
position 475 and ends at position 1629. The transcript also has the following SNPs as listed in
Table 248 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein H38804_PEA_1_P17 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 248 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
161                                    C ->                        No
167                                    C ->                        No
1118                                   G -> A                      No
1123                                   T ->                        No
1134                                   C ->                        No
1134                                   C -> A                      No
1207                                   T ->                        No
1207                                   T -> G                      No
1216                                   A ->                        No
1216                                   A -> C                      No
1241                                   T ->                        No
1241                                   T -> A                      No
167                                    C -> A                      No
1248                                   C ->                        No
1248                                   C -> G                      No
1270                                   G -> A                      Yes
1272                                   C ->                        No
1272                                   C -> A                      No
1361                                   C -> G                      No
1361                                   C -> T                      No
1391                                   T -> G                      No
1414                                   T ->                        No
1419                                   A -> G                      No
192                                    T ->                        No
1555                                   A -> G                      No
1615                                   A ->                        No
1721                                   G ->                        No
1918                                   T -> C                      No
1918                                   T -> G                      No
2029                                   T -> G                      No
2260                                   T ->                        No
2315                                   T ->                        No
2316                                   T -> G                      No
2363                                   T ->                        No
208                                    C -> T                      Yes
2530                                   -> G                        No
2610                                   C ->                        No
2610                                   C -> G                      No
2642                                   T ->                        No
2642                                   T -> G                      No
2647                                   A ->                        No
2647                                   A -> C                      No
2683                                   C -> T                      Yes
2821                                   T -> G                      No
2871                                   G -> A                      Yes
849                                    G -> T                      No
2873                                   T -> A                      No
2874                                   T -> A                      No
2999                                   T -> G                      No
3105                                   T -> G                      No
3161                                   G -> A                      Yes
850                                    C -> T                      No
855                                    C -> T                      Yes
861                                    T -> G                      Yes
1098                                   T -> C                      No


As noted above, cluster H38804 features 20 segment(s), which were listed in Table 240
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster H38804_PEA_1_node_0 according to the present invention is supported
by 125 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 249 below describes the starting and ending position of this segment on each transcript.

Table 249 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1                            213
H38804_PEA_1_T8      1                            213



Segment cluster H38804_PEA_1_node_1 according to the present invention is supported
by 9 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 250 below describes the starting and ending position of this segment on each transcript.

Table 250 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     214                          645
H38804_PEA_1_T8      214                          645



Segment cluster H38804_PEA_1_node_16 according to the present invention is supported
by 214 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 251 below describes the starting and ending position of this segment on each transcript.

Table 251 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1063                         1221
H38804_PEA_1_T8      1063                         1221



Segment cluster H38804_PEA_1_node_19 according to the present invention is supported
by 198 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 252 below describes the starting and ending position of this segment on each transcript.

Table 252 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1222                         1360
H38804_PEA_1_T8      1222                         1360



Segment cluster H38804_PEA_1_node_24 according to the present invention is supported
by 180 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 253 below describes the starting and ending position of this segment on each transcript.

Table 253 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1421                         1616
H38804_PEA_1_T8      1421                         1616



Segment cluster H38804_PEA_1_node_25 according to the present invention is supported
by 28 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T8. Table 254 below describes the
starting and ending position of this segment on each transcript.

Table 254 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T8      1617                         1969



Segment cluster H38804_PEA_1_node_28 according to the present invention is supported
by 38 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T8. Table 255 below describes the
starting and ending position of this segment on each transcript.

Table 255 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T8      2018                         2607



Segment cluster H38804_PEA_1_node_29 according to the present invention is supported
by 259 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 256 below describes the starting and ending position of this segment on each transcript.

Table 256 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1617                         2844
H38804_PEA_1_T8      2608                         3835



Segment cluster H38804_PEA_1_node_30 according to the present invention is supported
by 169 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 257 below describes the starting and ending position of this segment on each transcript.

Table 257 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     2845                         3170
H38804_PEA_1_T8      3836                         4161



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 
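
The distinction between short and longer segments follows directly from the start and end positions given in the segment tables. The Python sketch below computes a segment length and applies the 120 bp threshold mentioned above; it assumes that positions are 1-based and inclusive, which is consistent with consecutive segments abutting (for example, one ending at 213 and the next starting at 214). The function names are hypothetical.

```python
SHORT_SEGMENT_MAX_BP = 120  # "up to about 120 bp", per the text above


def segment_length(start: int, end: int) -> int:
    """Length in bp of a segment given 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def is_short_segment(start: int, end: int) -> bool:
    """True when the segment is no longer than the short-segment threshold."""
    return segment_length(start, end) <= SHORT_SEGMENT_MAX_BP


node_0_len = segment_length(1, 213)       # H38804_PEA_1_node_0 (Table 249)
node_10_len = segment_length(841, 910)    # H38804_PEA_1_node_10 (Table 258)
```

By this measure H38804_PEA_1_node_0 spans 213 bp and belongs with the longer segments, whereas H38804_PEA_1_node_10 spans 70 bp and is described among the short segments.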



Segment cluster H38804_PEA_1_node_10 according to the present invention is supported
by 179 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 258 below describes the starting and ending position of this segment on each transcript.

Table 258 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     841                          910
H38804_PEA_1_T8      841                          910



Segment cluster H38804_PEA_1_node_12 according to the present invention is supported
by 181 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 259 below describes the starting and ending position of this segment on each transcript.

Table 259 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     911                          949
H38804_PEA_1_T8      911                          949



Segment cluster H38804_PEA_1_node_13 according to the present invention is supported
by 187 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 260 below describes the starting and ending position of this segment on each transcript.

Table 260 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     950                          1028
H38804_PEA_1_T8      950                          1028



Segment cluster H38804_PEA_1_node_14 according to the present invention is supported
by 179 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 261 below describes the starting and ending position of this segment on each transcript.

Table 261 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1029                         1062
H38804_PEA_1_T8      1029                         1062

Segment cluster H38804_PEA_1_node_2 according to the present invention is supported
by 156 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 262 below describes the starting and ending position of this segment on each transcript.

Table 262 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     646                          678
H38804_PEA_1_T8      646                          678



Segment cluster H38804_PEA_1_node_20 according to the present invention is supported
by 162 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 263 below describes the starting and ending position of this segment on each transcript.

Table 263 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1361                         1399
H38804_PEA_1_T8      1361                         1399



Segment cluster H38804_PEA_1_node_23 according to the present invention can be
found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8. Table 264
below describes the starting and ending position of this segment on each transcript.

Table 264 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     1400                         1420
H38804_PEA_1_T8      1400                         1420



Segment cluster H38804_PEA_1_node_26 according to the present invention is supported
by 21 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T8. Table 265 below describes the
starting and ending position of this segment on each transcript.

Table 265 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T8      1970                         2017



Segment cluster H38804_PEA_1_node_3 according to the present invention is supported
by 162 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 266 below describes the starting and ending position of this segment on each transcript.

Table 266 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     679                          716
H38804_PEA_1_T8      679                          716



Segment cluster H38804_PEA_1_node_4 according to the present invention is supported
by 172 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8.
Table 267 below describes the starting and ending position of this segment on each transcript.

Table 267 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     717                          827
H38804_PEA_1_T8      717                          827



Segment cluster H38804_PEA_1_node_5 according to the present invention can be found
in the following transcript(s): H38804_PEA_1_T24 and H38804_PEA_1_T8. Table 268 below
describes the starting and ending position of this segment on each transcript.

Table 268 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
H38804_PEA_1_T24     828                          840
H38804_PEA_1_T8      828                          840



5 



10 



Variant protein alignment to the previously known protein:

Sequence name: /tmp/RR4oV8zYLg/QlORqeqpIp:BUB3_HUMAN

Sequence documentation:

Alignment of: H38804_PEA_1_P5 x BUB3_HUMAN

Alignment segment 1/1:

Quality: 3244.00
Escore: 0
Matching length: 324                   Total length: 324
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

 58 MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMR 107
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMR 50

108 LKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIR 157
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 LKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIR 100

158 CVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRL 207
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 CVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRL 150

208 IVGTAGRRVLVWDLRNMGYVQQRRESSLKYQTRCIRAFPNKQGYVLSSIE 257
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 IVGTAGRRVLVWDLRNMGYVQQRRESSLKYQTRCIRAFPNKQGYVLSSIE 200

258 GRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFA 307
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFA 250

308 TGGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYE 357
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 TGGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYE 300

358 MDDTEHPEDGIFIRQVTDAETKPK 381
    ||||||||||||||||||||||||
301 MDDTEHPEDGIFIRQVTDAETKPK 324



Sequence name: /tmp/DbOdQEpSuo/Lr8HPXaeBg:BUB3_HUMAN

Sequence documentation:

Alignment of: H38804_PEA_1_P17 x BUB3_HUMAN

Alignment segment 1/1:

Quality: 3288.00
Escore: 0
Matching length: 328                   Total length: 328
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

 58 MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMR 107
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMR 50

108 LKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIR 157
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 LKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDAPIR 100

158 CVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRL 207
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 CVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRL 150

208 IVGTAGRRVLVWDLRNMGYVQQRRESSLKYQTRCIRAFPNKQGYVLSSIE 257
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 IVGTAGRRVLVWDLRNMGYVQQRRESSLKYQTRCIRAFPNKQGYVLSSIE 200

258 GRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFA 307
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFA 250

308 TGGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYE 357
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 TGGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYE 300

358 MDDTEHPEDGIFIRQVTDAETKPKSPCT 385
    ||||||||||||||||||||||||||||
301 MDDTEHPEDGIFIRQVTDAETKPKSPCT 328
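
The summary figures reported with each alignment (matching length, percent identity, gaps) can be recomputed from the aligned rows. The following is a minimal sketch, not the program that produced the alignments above: the function name, the use of "-" as the gap character, and the exact definitions of "matching" versus "total" length are assumptions made for illustration.

```python
def alignment_stats(row_a: str, row_b: str, gap: str = "-") -> dict:
    """Summarise a pairwise alignment given as two equal-length aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    columns = list(zip(row_a, row_b))
    # Columns in which either row holds a gap character (assumed to be "-").
    gaps = sum(1 for x, y in columns if x == gap or y == gap)
    # Assumed definition: matching length = columns where both rows have a residue.
    matching = len(columns) - gaps
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return {
        "matching_length": matching,
        "total_length": len(columns),
        "matching_percent_identity": 100.0 * identical / matching if matching else 0.0,
        "total_percent_identity": 100.0 * identical / len(columns) if columns else 0.0,
        "gaps": gaps,
    }


# Last block of the H38804_PEA_1_P17 x BUB3_HUMAN alignment shown above.
variant_row = "MDDTEHPEDGIFIRQVTDAETKPKSPCT"  # variant residues 358-385
known_row = "MDDTEHPEDGIFIRQVTDAETKPKSPCT"    # BUB3_HUMAN residues 301-328
stats = alignment_stats(variant_row, known_row)
```

Applied to every block of an alignment and summed, the same counts would be expected to reproduce a gap-free, 100% identical alignment such as the two reported here.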



DESCRIPTION FOR CLUSTER HSENA78 
Cluster HSENA78 features 1 transcript(s) and 7 segment(s) of interest, the names for
which are given in Tables 269 and 270, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 271.

Table 269 - Transcripts of interest

Transcript Name      Sequence ID No.
HSENA78_T5           31

Table 270 - Segments of interest

Segment Name         Sequence ID No.
HSENA78_node_0       398
HSENA78_node_2       399
HSENA78_node_6       400
HSENA78_node_9       401
HSENA78_node_3       402
HSENA78_node_4       403
HSENA78_node_8       404


Table 271 - Proteins of interest

Protein Name         Sequence ID No.
HSENA78_P2           1309



These sequences are variants of the known protein Small inducible cytokine B5 precursor
(SwissProt accession identifier SZ05_HUMAN; known also according to the synonyms
CXCL5; Epithelial-derived neutrophil activating protein 78; Neutrophil-activating peptide
ENA-78), SEQ ID NO: 1425, referred to herein as the previously known protein.

Protein Small inducible cytokine B5 precursor is known or believed to have the following
function(s): Involved in neutrophil activation. The sequence for protein Small inducible
cytokine B5 precursor is given at the end of the application, as "Small inducible cytokine B5
precursor amino acid sequence". Protein Small inducible cytokine B5 precursor localization is
believed to be Secreted.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: chemotaxis; signal transduction; cell-cell signaling; positive control
of cell proliferation, which are annotation(s) related to Biological Process; and chemokine,
which are annotation(s) related to Molecular Function.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HSENA78 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 24 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
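
The "parts per million" figure defined above is a plain ratio scaled by one million. A minimal sketch follows, assuming raw EST counts per category are available; the function name and the example counts are hypothetical, and any weighting applied by the previously described methods is not reproduced here.

```python
def parts_per_million(cluster_ests: float, all_ests: float) -> float:
    """Expression of one cluster relative to all ESTs in a category, in ppm."""
    if all_ests <= 0:
        raise ValueError("the category must contain at least one EST")
    return 1_000_000 * cluster_ests / all_ests


# Hypothetical counts: 3 ESTs from the cluster among 1,000,000 ESTs in the category.
example_ppm = parts_per_million(3, 1_000_000)
```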

Overall, the following results were obtained as shown with regard to the histograms in
Figure 24 and Table 272. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors and lung malignant tumors.



Table 272 - Normal tissue distribution

Name of Tissue       Number
colon                0
epithelial           2
general              38
kidney               0
lung                 3
breast               8
skin                 0
stomach              36
uterus               4



Table 273 - P values and ratios for expression in cancerous tissue

Name of Tissue   P1        P2        SP1       R3     SP2       R4
colon            2.6e-01   3.3e-01   1.7e-01   2.7    2.7e-01   2.2
epithelial       2.5e-01   9.0e-02   3.2e-03   4.1    8.5e-07   5.5
general          8.4e-01   7.2e-01   1         0.3    1         0.4
kidney           1         7.2e-01   1         1.0    1.7e-01   1.9
lung             8.5e-01   4.8e-01   4.1e-01   1.9    4.0e-05   3.8
breast           9.5e-01   8.7e-01   1         0.8    6.8e-01   1.2
skin             2.9e-01   4.7e-01   1.4e-01   7.0    6.4e-01   1.6
stomach          5.0e-01   4.3e-01   7.5e-01   1.0    4.3e-01   1.3
uterus           7.1e-01   8.5e-01   6.6e-01   1.3    8.0e-01   1.0
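
As an illustration of how a table of this kind might be screened, the sketch below flags tissues whose SP2 value is small and whose R4 ratio is large. The column meanings are defined elsewhere in the application; treating SP2 as a p-value and R4 as its associated cancer-to-normal ratio, and the two cut-offs used here, are assumptions chosen only for the example and are not the criteria of the present invention.

```python
# Rows of Table 273: tissue -> (P1, P2, SP1, R3, SP2, R4)
TABLE_273 = {
    "colon":      (2.6e-01, 3.3e-01, 1.7e-01, 2.7, 2.7e-01, 2.2),
    "epithelial": (2.5e-01, 9.0e-02, 3.2e-03, 4.1, 8.5e-07, 5.5),
    "general":    (8.4e-01, 7.2e-01, 1.0,     0.3, 1.0,     0.4),
    "kidney":     (1.0,     7.2e-01, 1.0,     1.0, 1.7e-01, 1.9),
    "lung":       (8.5e-01, 4.8e-01, 4.1e-01, 1.9, 4.0e-05, 3.8),
    "breast":     (9.5e-01, 8.7e-01, 1.0,     0.8, 6.8e-01, 1.2),
    "skin":       (2.9e-01, 4.7e-01, 1.4e-01, 7.0, 6.4e-01, 1.6),
    "stomach":    (5.0e-01, 4.3e-01, 7.5e-01, 1.0, 4.3e-01, 1.3),
    "uterus":     (7.1e-01, 8.5e-01, 6.6e-01, 1.3, 8.0e-01, 1.0),
}


def flag_tissues(table: dict, max_sp2: float = 1e-3, min_r4: float = 2.0) -> list:
    """Return tissues with SP2 <= max_sp2 and R4 >= min_r4 (hypothetical cut-offs)."""
    return sorted(
        tissue
        for tissue, (_p1, _p2, _sp1, _r3, sp2, r4) in table.items()
        if sp2 <= max_sp2 and r4 >= min_r4
    )


flagged = flag_tissues(TABLE_273)
```

With these illustrative cut-offs the flagged tissues coincide with the two conditions named in the text for this cluster.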



As noted above, cluster HSENA78 features 1 transcript(s), which were listed in Table 269
above. These transcript(s) encode for protein(s) which are variant(s) of protein Small inducible
cytokine B5 precursor. A description of each variant protein according to the present invention
is now provided.



Variant protein HSENA78_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSENA78_T5. An
alignment is given to the known protein (Small inducible cytokine B5 precursor) at the end of
the application. One or more alignments to one or more previously published protein sequences
are given at the end of the application. A brief description of the relationship of the variant
protein according to the present invention to each such aligned protein is as follows:
Comparison report between HSENA78_P2 and SZ05_HUMAN:

1. An isolated chimeric polypeptide encoding for HSENA78_P2, comprising a first amino
acid sequence being at least 90% homologous to
MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCVCLQTTQGVHP
KMISNLQVFAIGPQCSKVEVV corresponding to amino acids 1-81 of SZ05_HUMAN,
which also corresponds to amino acids 1-81 of HSENA78_P2.


The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSENA78_P2 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 274 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HSENA78_P2 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 274 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
80                                       V ->                        No
81                                       V ->                        No



Variant protein HSENA78_P2 is encoded by the following transcript(s): HSENA78_T5,
for which the sequence(s) is/are given at the end of the application. The coding portion of
transcript HSENA78_T5 is shown in bold; this coding portion starts at position 149 and ends at
position 391. The transcript also has the following SNPs as listed in Table 275 (given according
to their position on the nucleotide sequence, with the alternative nucleic acid listed; the last
column indicates whether the SNP is known or not; the presence of known SNPs in variant
protein HSENA78_P2 sequence provides support for the deduced sequence of this variant
protein according to the present invention).



Table 275 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
92                                    C -> T                     Yes
144                                   C -> T                     No
1151                                  A -> T                     Yes
1389                                  T -> C                     No
1867                                  C -> G                     Yes
145                                   C -> T                     No
181                                   C -> T                     Yes
316                                   G -> A                     Yes
388                                   G ->                       No
390                                   T ->                       No
605                                   T ->                       No
972                                   C -> T                     Yes
1105                                  A -> G                     Yes
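
The nucleotide SNP positions of Table 275 can be related to the amino acid positions of Table 274 through the stated coding portion (positions 149 to 391). The sketch below assumes 1-based, inclusive transcript coordinates and a coding portion that begins at the first base of codon 1; the function name is hypothetical and this is an illustration rather than the method used to build the tables.

```python
def codon_position(nt_pos: int, cds_start: int, cds_end: int):
    """Map a 1-based transcript position to (amino acid index, base within codon).

    Returns None when the position lies outside the coding portion.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START, CDS_END = 149, 391  # coding portion of HSENA78_T5 as stated above
SNP_POSITIONS = [92, 144, 1151, 1389, 1867, 145, 181, 316, 388, 390, 605, 972, 1105]

# Number of codons spanned by the coding portion.
codons = (CDS_END - CDS_START + 1) // 3

# Keep only the SNPs that fall inside the coding portion.
coding_snps = {}
for pos in SNP_POSITIONS:
    mapped = codon_position(pos, CDS_START, CDS_END)
    if mapped is not None:
        coding_snps[pos] = mapped
```

Under these assumptions the coding portion spans 81 codons, and nucleotide positions 388 and 390 fall in codons 80 and 81, the two amino acid positions listed in Table 274.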


As noted above, cluster HSENA78 features 7 segment(s), which were listed in Table 270
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HSENA78_node_0 according to the present invention is supported by 24
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 276 below describes the starting and
ending position of this segment on each transcript.

Table 276 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           1                           257



Segment cluster HSENA78_node_2 according to the present invention is supported by 22
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 277 below describes the starting and
ending position of this segment on each transcript.

Table 277 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           258                         390



Segment cluster HSENA78_node_6 according to the present invention is supported by 68
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 278 below describes the starting and
ending position of this segment on each transcript.

Table 278 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           585                         2370



Segment cluster HSENA78_node_9 according to the present invention is supported by 28
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 279 below describes the starting and
ending position of this segment on each transcript.

Table 279 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           2394                        2546



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 



Segment cluster HSENA78_node_3 according to the present invention is supported by 1
library. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 280 below describes the starting and
ending position of this segment on each transcript.

Table 280 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           391                         500



Segment cluster HSENA78_node_4 according to the present invention is supported by 17
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): HSENA78_T5. Table 281 below describes the starting and
ending position of this segment on each transcript.

Table 281 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           501                         584



Segment cluster HSENA78_node_8 according to the present invention can be found in the
following transcript(s): HSENA78_T5. Table 282 below describes the starting and ending
position of this segment on each transcript.

Table 282 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSENA78_T5           2371                        2393
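
The segment coordinates given in Tables 276 to 282 can be checked for consistency: whether the segments tile transcript HSENA78_T5 without gaps or overlaps, and which of them fall under the "up to about 120 bp" description of short segments. The following is a minimal sketch under the assumption that positions are 1-based and inclusive; the variable names and the hard 120 bp cut-off are chosen for illustration only.

```python
# Segment -> (start, end) on transcript HSENA78_T5, from Tables 276-282.
SEGMENTS = {
    "HSENA78_node_0": (1, 257),
    "HSENA78_node_2": (258, 390),
    "HSENA78_node_3": (391, 500),
    "HSENA78_node_4": (501, 584),
    "HSENA78_node_6": (585, 2370),
    "HSENA78_node_8": (2371, 2393),
    "HSENA78_node_9": (2394, 2546),
}

# Inclusive coordinates, so length = end - start + 1.
lengths = {name: end - start + 1 for name, (start, end) in SEGMENTS.items()}

# Sort by start position and confirm each segment begins right after the previous one ends.
ordered = sorted(SEGMENTS.values())
contiguous = all(nxt[0] == cur[1] + 1 for cur, nxt in zip(ordered, ordered[1:]))

# Segments no longer than the illustrative 120 bp cut-off.
short_segments = sorted(name for name, length in lengths.items() if length <= 120)
```

Under these assumptions the seven segments cover positions 1 to 2546 contiguously, and the three segments at or below 120 bp are the ones described in the short-segment portion of the text.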



Variant protein alignment to the previously known protein:

Sequence name: /tmp/5kiQY6MxWx/pLnTrxsCqk:SZ05_HUMAN

Sequence documentation:

Alignment of: HSENA78_P2 x SZ05_HUMAN

Alignment segment 1/1:

Quality: 767.00
Escore: 0
Matching length: 81                    Total length: 81
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCV 50

 51 CLQTTQGVHPKMISNLQVFAIGPQCSKVEVV 81
    |||||||||||||||||||||||||||||||
 51 CLQTTQGVHPKMISNLQVFAIGPQCSKVEVV 81



DESCRIPTION FOR CLUSTER HUMODCA
Cluster HUMODCA features 1 transcript(s) and 17 segment(s) of interest, the names for
which are given in Tables 283 and 284, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 285.

Table 283 - Transcripts of interest

Transcript Name      Sequence ID No.
HUMODCA_T17          32



Table 284 - Segments of interest

Segment Name         Sequence ID No.
HUMODCA_node_1       405
HUMODCA_node_25      406
HUMODCA_node_32      407
HUMODCA_node_36      408
HUMODCA_node_39      409
HUMODCA_node_41      410
HUMODCA_node_0       411
HUMODCA_node_10      412
HUMODCA_node_12      413
HUMODCA_node_13      414
HUMODCA_node_2       415
HUMODCA_node_27      416
HUMODCA_node_3       417
HUMODCA_node_30      418
HUMODCA_node_34      419
HUMODCA_node_38      420
HUMODCA_node_40      421


Table 285 - Proteins of interest

Protein Name         Sequence ID No.
HUMODCA_P9           1310



These sequences are variants of the known protein Ornithine decarboxylase (SwissProt
accession identifier DCOR_HUMAN; known also according to the synonyms EC 4.1.1.17;
ODC), SEQ ID NO: 1426, referred to herein as the previously known protein.

Protein Ornithine decarboxylase is known or believed to have the following function(s):
Polyamine biosynthesis; first (rate-limiting) step. The sequence for protein Ornithine
decarboxylase is given at the end of the application, as "Ornithine decarboxylase amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 286.

Table 286 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence   Comment
415                                      Q -> E



The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: polyamine biosynthesis, which are annotation(s) related to Biological
Process; and ornithine decarboxylase; lyase, which are annotation(s) related to Molecular
Function.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.



Cluster HUMODCA can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 25 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 25 and Table 287. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: brain malignant tumors, colorectal cancer, epithelial
malignant tumors and a mixture of malignant tumors from different tissues.

Table 287 - Normal tissue distribution

Name of Tissue       Number
adrenal              120
bladder              82
bone                 161
brain                53
colon                0
epithelial           107
general              94
head and neck        10
kidney               114
liver                107
lung                 120
lymph nodes          165
breast               61
bone marrow          156
muscle               55
ovary                36
pancreas             102
prostate             140
skin                 188
stomach              109
T cells              278
Thyroid              128
uterus               118



Table 288 - P values and ratios for expression in cancerous tissue

Name of Tissue   P1        P2        SP1       R3     SP2       R4
adrenal          8.3e-01   7.8e-01   1         0.2    8.5e-01   0.7
bladder          5.4e-01   5.1e-01   6.2e-01   1.1    5.0e-01   1.1
bone             8.3e-01   3.2e-01   1         0.2    8.4e-01   0.7
brain            2.6e-01   3.8e-02   6.5e-04   2.8    8.7e-10   3.6
colon            2.2e-02   5.8e-03   1.5e-03   6.9    6.7e-05   9.9
epithelial       6.4e-02   2.7e-03   1.4e-03   1.5    1.6e-12   2.1
general          1.3e-03   5.4e-08   1.9e-08   1.7    1.4e-39   2.6
head and neck    1.7e-01   1.7e-01   1         1.2    7.5e-01   1.3
kidney           7.7e-01   7.6e-01   7.1e-01   0.8    6.6e-01   0.9



WO 2006/131783 



PCT/IB2005/004037 



421 



liver            7.3e-01   5.7e-01   1         0.3    2.4e-01   1.2
lung             7.8e-01   5.8e-01   7.6e-01   0.6    7.3e-04   1.7
lymph nodes      3.9e-01   2.5e-01   1.8e-01   1.1    1.4e-04   2.1
breast           7.8e-01   4.7e-01   7.7e-01   0.8    6.4e-01   1.0
bone marrow      3.4e-01   2.6e-01   2.8e-01   2.1    1.6e-01   1.2
muscle           8.5e-01   6.1e-01   1         0.2    7.1e-05   1.0
ovary            1.7e-01   9.3e-02   3.8e-01   1.7    2.2e-02   2.6
pancreas         2.2e-01   3.2e-01   5.7e-02   1.6    6.6e-03   [illegible]
prostate         5.0e-01   4.9e-01   3.8e-02   1.9    4.5e-02   1.7
skin             6.2e-01   5.8e-01   5.4e-02   0.9    1.5e-02   0.5
stomach          4.2e-01   2.6e-01   3.7e-01   0.7    7.3e-03   2.3
T cells          1         1         5.5e-01   1.5    8.1e-01   0.9
Thyroid          8.3e-02   8.3e-02   5.9e-01   1.3    5.9e-01   1.3
uterus           4.2e-01   2.4e-01   1.6e-01   1.2    4.9e-02   1.7



As noted above, cluster HUMODCA features 1 transcript(s), which were listed in Table
283 above. These transcript(s) encode for protein(s) which are variant(s) of protein Ornithine
decarboxylase. A description of each variant protein according to the present invention is now
provided.



Variant protein HUMODCA_P9 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HUMODCA_T17.
An alignment is given to the known protein (Ornithine decarboxylase) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:
Comparison report between HUMODCA_P9 and DCOR_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 -
29 of HUMODCA_P9, and a second amino acid sequence being at least 90% homologous to
LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFV
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSG
VRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFN
CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA
WESGMKRHRAACASASINV corresponding to amino acids 151 - 461 of DCOR_HUMAN,
which also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of HUMODCA_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.

Comparison report between HUMODCA_P9 and AAA59968 (SEQ ID NO:1702):

1. An isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 -
29 of HUMODCA_P9, and a second amino acid sequence being at least 90% homologous to
LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFV
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSG
VRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFN
CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA
WESGMKRHRAACASASINV corresponding to amino acids 40 - 350 of AAA59968, which
also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and second
amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of HUMODCA_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.

Comparison report between HUMODCA_P9 and AAH14562 (SEQ ID NO:1703):

1. An isolated chimeric polypeptide encoding for HUMODCA_P9, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL corresponding to amino acids 1 -
29 of HUMODCA_P9, and a second amino acid sequence being at least 90% homologous to
LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFV
QAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSG
VRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFN
CILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCA
WESGMKRHRAACASASINV corresponding to amino acids 86 - 396 of AAH14562, which
also corresponds to amino acids 30 - 340 of HUMODCA_P9, wherein said first and second
amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of HUMODCA_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MKSLTATSSMKVLLPRTFWTRKLMKFLLL of HUMODCA_P9.
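
The three comparison reports above describe the same layout: a 29-residue head unique to HUMODCA_P9 followed by a tail that corresponds to a fixed range of each reference protein. The sketch below converts a variant position into the corresponding reference position; it assumes the correspondence within the tail is one-to-one and gap-free, as the reported ranges suggest, and the function and constant names are hypothetical.

```python
HEAD_SEQUENCE = "MKSLTATSSMKVLLPRTFWTRKLMKFLLL"  # head of HUMODCA_P9 (positions 1-29)
VARIANT_TAIL = (30, 340)                         # tail positions on HUMODCA_P9

# Reference ranges reported as corresponding to the variant tail.
REFERENCE_TAILS = {
    "DCOR_HUMAN": (151, 461),
    "AAA59968": (40, 350),
    "AAH14562": (86, 396),
}


def to_reference(variant_pos: int, reference: str):
    """Map a 1-based HUMODCA_P9 position to the named reference protein.

    Returns None for head positions, which have no counterpart in the reference.
    """
    tail_start, tail_end = VARIANT_TAIL
    if not tail_start <= variant_pos <= tail_end:
        return None
    ref_start, _ref_end = REFERENCE_TAILS[reference]
    return ref_start + (variant_pos - tail_start)


# Every reference range should have the same length as the variant tail.
tail_length = VARIANT_TAIL[1] - VARIANT_TAIL[0] + 1
ranges_consistent = all(
    end - start + 1 == tail_length for start, end in REFERENCE_TAILS.values()
)
```

Under this assumption the head and tail lengths add up to the 340 residues of the variant, and each reference range has the same 311-residue length as the variant tail.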

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMODCA_P9 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 289 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMODCA_P9
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 289 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
150                                      I -> S                      No
150                                      I -> V                      No
262                                      F -> L                      No
263                                      E ->                        No
263                                      E -> G                      No
30                                       L ->                        No
301                                      N ->                        No
301                                      N -> K                      No
309                                      E -> K                      No
312                                      D -> N                      No
323                                      E -> K                      No
329                                      H -> P                      No
174                                      I ->                        No
34                                       I ->                        No
59                                       L ->                        No
70                                       V ->                        No
86                                       T ->                        No
86                                       T -> N                      No
90                                       A ->                        No
94                                       A ->                        No
97                                       V ->                        No
97                                       V -> G                      No
198                                      N -> D                      No
200                                      G ->                        No
3                                        S ->                        No
207                                      C -> G                      No
207                                      C -> R                      No
223                                      P ->                        No
262                                      F ->                        No



Variant protein HUMODCA_P9 is encoded by the following transcript(s):
HUMODCA_T17, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMODCA_T17 is shown in bold; this coding portion starts at
position 528 and ends at position 1547. The transcript also has the following SNPs as listed in
Table 290 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMODCA_P9 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 290 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
28                                    C -> G                     Yes
210                                   C ->                       No
536                                   T ->                       No
615                                   T ->                       No
628                                   T ->                       No
703                                   T ->                       No
736                                   T ->                       No
784                                   C ->                       No
784                                   C -> A                     No
797                                   A ->                       No
797                                   A -> T                     No
808                                   C ->                       No
217                                   C ->                       No
817                                   T ->                       No
817                                   T -> G                     No
869                                   C -> T                     Yes
975                                   A -> G                     No
976                                   T -> G                     No
1048                                  T ->                       No
1119                                  A -> G                     No
1127                                  C ->                       No
1127                                  C -> G                     No
1146                                  T -> C                     No
366                                   G -> C                     No
1146                                  T -> G                     No
1194                                  C ->                       No
1283                                  T -> C                     Yes
1311                                  T ->                       No
1311                                  T -> C                     No
1315                                  A ->                       No
1315                                  A -> G                     No
1430                                  C ->                       No
1430                                  C -> A                     No
1433                                  C -> G                     No
366                                   G -> T                     No
1433                                  C -> T                     Yes
1452                                  G -> A                     No
1461                                  G -> A                     No
1494                                  G -> A                     No
1513                                  A -> C                     No
1632                                  T ->                       No
1673                                  C ->                       No
1739                                  T ->                       No
1739                                  T -> G                     No
1742                                  T -> C                     No
447                                   G -> A                     Yes
1786                                  C ->                       No
1786                                  C -> G                     No
1832                                  T -> C                     Yes
1877                                  C -> T                     No
464                                   T -> G                     Yes
473                                   A -> G                     Yes
506                                   G -> A                     Yes
521                                   T ->                       No



As noted above, cluster HUMODCA features 17 segment(s), which were listed in Table
284 above and for which the sequence(s) are given at the end of the application. These
segment(s) are portions of nucleic acid sequence(s) which are described herein separately
because they are of particular interest. A description of each segment according to the present
invention is now provided.

Segment cluster HUMODCA_node_1 according to the present invention is supported by
76 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 291 below describes the
starting and ending position of this segment on each transcript.

Table 291 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMODCA_T17          118                         256



Segment cluster HUMODCA_node_25 according to the present invention is supported by
190 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 292 below describes the
starting and ending position of this segment on each transcript.

Table 292 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        614                          748

Segment cluster HUMODCA_node_32 according to the present invention is supported by
249 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 293 below describes the
starting and ending position of this segment on each transcript.

Table 293 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        915                          1077



Segment cluster HUMODCA_node_36 according to the present invention is supported by
348 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 294 below describes the
starting and ending position of this segment on each transcript.

Table 294 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1191                         1405



Segment cluster HUMODCA_node_39 according to the present invention is supported by
297 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 295 below describes the
starting and ending position of this segment on each transcript.

Table 295 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1461                         1633

Segment cluster HUMODCA_node_41 according to the present invention is supported by
230 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 296 below describes the
starting and ending position of this segment on each transcript.

Table 296 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1728                         1893



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 



Segment cluster HUMODCA_node_0 according to the present invention is supported by
9 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 297 below describes the
starting and ending position of this segment on each transcript.

Table 297 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1                            117



Segment cluster HUMODCA_node_10 according to the present invention is supported by
107 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 298 below describes the
starting and ending position of this segment on each transcript.

Table 298 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        385                          494

Segment cluster HUMODCA_node_12 according to the present invention is supported by
132 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 299 below describes the
starting and ending position of this segment on each transcript.

Table 299 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        495                          586



Segment cluster HUMODCA_node_13 according to the present invention is supported by
126 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 300 below describes the
starting and ending position of this segment on each transcript.

Table 300 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        587                          613



Segment cluster HUMODCA_node_2 according to the present invention is supported by
81 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 301 below describes the
starting and ending position of this segment on each transcript.

Table 301 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        257                          328



Segment cluster HUMODCA_node_27 according to the present invention is supported by
185 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 302 below describes the
starting and ending position of this segment on each transcript.

Table 302 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        749                          830



Segment cluster HUMODCA_node_3 according to the present invention is supported by
85 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 303 below describes the
starting and ending position of this segment on each transcript.

Table 303 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        329                          384



Segment cluster HUMODCA_node_30 according to the present invention is supported by
196 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 304 below describes the
starting and ending position of this segment on each transcript.

Table 304 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        831                          914



Segment cluster HUMODCA_node_34 according to the present invention is supported by
259 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 305 below describes the
starting and ending position of this segment on each transcript.

Table 305 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1078                         1190



Segment cluster HUMODCA_node_38 according to the present invention is supported by
272 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 306 below describes the
starting and ending position of this segment on each transcript.

Table 306 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1406                         1460



Segment cluster HUMODCA_node_40 according to the present invention is supported by
239 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HUMODCA_T17. Table 307 below describes the
starting and ending position of this segment on each transcript.

Table 307 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMODCA_T17        1634                         1727
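
The segment coordinates in Tables 291 to 307 appear to tile transcript HUMODCA_T17 end to end. The following is a minimal sketch of how that could be checked, assuming the positions are 1-based and inclusive; the dictionary and the function name are illustrative and are not part of the application.

```python
# Segment coordinates on HUMODCA_T17 as given in Tables 291-307.
# Assumption: positions are 1-based and inclusive at both ends.
SEGMENTS = {
    "HUMODCA_node_0": (1, 117),
    "HUMODCA_node_1": (118, 256),
    "HUMODCA_node_2": (257, 328),
    "HUMODCA_node_3": (329, 384),
    "HUMODCA_node_10": (385, 494),
    "HUMODCA_node_12": (495, 586),
    "HUMODCA_node_13": (587, 613),
    "HUMODCA_node_25": (614, 748),
    "HUMODCA_node_27": (749, 830),
    "HUMODCA_node_30": (831, 914),
    "HUMODCA_node_32": (915, 1077),
    "HUMODCA_node_34": (1078, 1190),
    "HUMODCA_node_36": (1191, 1405),
    "HUMODCA_node_38": (1406, 1460),
    "HUMODCA_node_39": (1461, 1633),
    "HUMODCA_node_40": (1634, 1727),
    "HUMODCA_node_41": (1728, 1893),
}


def find_tiling_problems(segments):
    """Return (previous, next, previous_end, next_start) for every gap or overlap."""
    ordered = sorted(segments.items(), key=lambda item: item[1][0])
    problems = []
    for (prev_name, (_, prev_end)), (name, (start, _)) in zip(ordered, ordered[1:]):
        # Consecutive segments tile the transcript only if the next one
        # starts exactly one position after the previous one ends.
        if start != prev_end + 1:
            problems.append((prev_name, name, prev_end, start))
    return problems


problems = find_tiling_problems(SEGMENTS)
covered = sum(end - start + 1 for start, end in SEGMENTS.values())
print("gaps or overlaps:", problems)
print("nucleotides covered:", covered)
```

An empty problem list together with a covered length equal to the last ending position indicates that the 17 segments partition the transcript without gaps or overlaps.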



Variant protein alignment to the previously known protein:

Sequence name: /tmp/y03EwE6i01/dRQ512K6e2: DCOR_HUMAN
Sequence documentation:

Alignment of: HUMODCA_P9 x DCOR_HUMAN

Alignment segment 1/1:

Quality: 3056.00
Escore: 0
Matching length: 311
Total length: 311
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 30 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS  79
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS 200

 80 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 129
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 250

130 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 179
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 300

180 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 229
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 350

230 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 279
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 400

280 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 329
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 450

330 RAACASASINV 340
    |||||||||||
451 RAACASASINV 461

Sequence name: /tmp/y03EwE6i01/dRQ512K6e2: AAA59968
Sequence documentation:

Alignment of: HUMODCA_P9 x AAA59968

Alignment segment 1/1:

Quality: 3056.00
Escore: 0
Matching length: 311
Total length: 311
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 30 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS  79
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 40 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS  89

 80 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 129
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 90 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 139

130 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 179
    ||||||||||||||||||||||||||||||||||||||||||||||||||
140 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 189

180 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 229
    ||||||||||||||||||||||||||||||||||||||||||||||||||
190 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 239

230 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 279
    ||||||||||||||||||||||||||||||||||||||||||||||||||
240 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 289

280 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 329
    ||||||||||||||||||||||||||||||||||||||||||||||||||
290 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 339

330 RAACASASINV 340
    |||||||||||
340 RAACASASINV 350



Sequence name: /tmp/y03EwE6i01/dRQ512K6e2: AAH14562
Sequence documentation:

Alignment of: HUMODCA_P9 x AAH14562

Alignment segment 1/1:

Quality: 3056.00
Escore: 0
Matching length: 311
Total length: 311
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 30 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS  79
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 86 LVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGS 135

 80 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 129
    ||||||||||||||||||||||||||||||||||||||||||||||||||
136 GCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEE 185

130 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 179
    ||||||||||||||||||||||||||||||||||||||||||||||||||
186 ITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ 235

180 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 229
    ||||||||||||||||||||||||||||||||||||||||||||||||||
236 TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKY 285

230 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 279
    ||||||||||||||||||||||||||||||||||||||||||||||||||
286 YSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGF 335

280 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 329
    ||||||||||||||||||||||||||||||||||||||||||||||||||
336 QRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRH 385

330 RAACASASINV 340
    |||||||||||
386 RAACASASINV 396
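
The identity figures reported with each alignment come from the alignment software. As a rough illustration only, percent identity over a gapless or gapped pairwise alignment can be computed as below. The exact definition used by the original tool is not stated in this section, so the formula (identical columns divided by aligned columns) and the function names are assumptions.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned columns holding the same residue in both rows.

    Assumption: both strings are rows of one alignment, so they have the
    same length, and '-' marks a gap. A gap never counts as a match.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    if not query:
        raise ValueError("alignment is empty")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return 100.0 * matches / len(query)


def count_gaps(query: str, subject: str) -> int:
    """Number of gap characters across both rows of the alignment."""
    return query.count("-") + subject.count("-")


# Last block of the HUMODCA_P9 alignments (residues 330-340 of the variant).
variant_block = "RAACASASINV"
known_block = "RAACASASINV"
print(percent_identity(variant_block, known_block), count_gaps(variant_block, known_block))
```

Applied block by block to the alignments above, every column matches and no gap characters occur, which agrees with the reported identity of 100.00 and gap count of 0.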



DESCRIPTION FOR CLUSTER R00299 

Cluster R00299 features 1 transcript(s) and 12 segment(s) of interest, the names for which
are given in Tables 308 and 309, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 310.

Table 308 - Transcripts of interest

Transcript Name    Sequence ID No.
R00299_T2          33


Table 309 - Segments of interest

Segment Name      Sequence ID No.
R00299_node_2     422
R00299_node_30    423
R00299_node_10    424
R00299_node_14    425
R00299_node_15    426
R00299_node_20    427
R00299_node_23    428
R00299_node_25    429
R00299_node_28    430
R00299_node_31    431
R00299_node_5     432
R00299_node_9     433


Table 310 - Proteins of interest

Protein Name    Sequence ID No.
R00299_P3       1311



These sequences are variants of the known protein Tescalcin (SwissProt accession
identifier TESC_HUMAN; known also according to the synonym TSC), SEQ ID NO: 1427,
referred to herein as the previously known protein.

Protein Tescalcin is known or believed to have the following function(s): Binds calcium.
The sequence for protein Tescalcin is given at the end of the application, as "Tescalcin amino
acid sequence".

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: calcium binding, which are annotation(s) related to Molecular
Function.

The GO assignment relies on information from one or more of the SwissProt/TremBl 
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available 
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>. 

Cluster R00299 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 26 below refer to weighted expression of ESTs
in each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
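
As an illustration of the "parts per million" scale described above, the sketch below converts an EST count for one cluster into a value per million ESTs in the same tissue category. The weighting scheme is described elsewhere in the application and is not reproduced here; the function name and the example counts are hypothetical.

```python
def ests_per_million(cluster_est_count: float, total_est_count: float) -> float:
    """Express a cluster's EST count as parts per million of all ESTs in a category.

    Assumption: any weighting has already been applied to both counts;
    this function only performs the ratio-to-ppm scaling.
    """
    if total_est_count <= 0:
        raise ValueError("total EST count must be positive")
    return cluster_est_count * 1_000_000 / total_est_count


# Hypothetical counts: 3 cluster ESTs among 200,000 ESTs from one tissue category.
print(ests_per_million(3, 200_000))
```

A cluster with no ESTs in a category scores 0 on this scale, which is how the zero entries in the normal tissue distribution table can be read.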

Overall, the following results were obtained as shown with regard to the histograms in
Figure 26 and Table 311. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: lung malignant tumors.



Table 311 - Normal tissue distribution

Name of Tissue    Number
bone              0
colon             0
epithelial        11
general           11
liver             0
lung              10
lymph nodes       22
bone marrow       31
ovary             0
pancreas          14
prostate          16
stomach           76
T cells           0
Thyroid           0



Table 312 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
bone              1          6.7e-01    1          1.0    7.0e-01    1.4
colon             5.0e-02    5.3e-02    2.4e-01    2.8    2.1e-01    2.8
epithelial        7.7e-02    9.5e-02    4.0e-01    1.3    6.1e-03    1.9
general           2.3e-01    2.6e-01    5.3e-01    1.0    2.6e-04    1.9
liver             1          4.5e-01    1          1.0    6.9e-01    1.5
lung              4.9e-01    2.7e-01    6.5e-01    1.7    5.6e-04    3.8
lymph nodes       8.5e-01    8.7e-01    1          0.5    2.0e-01    1.1
bone marrow       8.6e-01    8.5e-01    1          0.5    2.3e-01    1.4
ovary             4.0e-01    4.4e-01    1          1.1    1          1.1
pancreas          7.2e-01    6.9e-01    6.7e-01    1.0    3.5e-01    1.5
prostate          8.7e-01    9.1e-01    6.7e-01    1.0    7.5e-01    0.9
stomach           6.6e-01    7.5e-01    1          0.4    6.7e-01    0.7
T cells           1          6.7e-01    1          1.0    5.2e-01    1.8
Thyroid           1.8e-01    1.8e-01    6.7e-01    1.6    6.7e-01    1.6



As noted above, cluster R00299 features 1 transcript(s), which were listed in Table 308
above. These transcript(s) encode for protein(s) which are variant(s) of protein Tescalcin. A
description of each variant protein according to the present invention is now provided.

Variant protein R00299_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) R00299_T2. An
alignment is given to the known protein (Tescalcin) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between R00299_P3 and Q9NWT9 (SEQ ID NO: 1704):

1. An isolated chimeric polypeptide encoding for R00299_P3, comprising a first amino
acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV
corresponding to amino acids 1 - 44 of R00299_P3, a second amino acid sequence being at least
90% homologous to
SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNRNLRKGPSGLA
DEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRITLEEYRNV
corresponding to amino acids 74 - 191 of Q9NWT9, which also corresponds to amino acids 45 -
162 of R00299_P3, and a third amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence
VEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIE
TKMHVRFLNMETMALCH corresponding to amino acids 163 - 238 of R00299_P3, wherein
said first, second and third amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of R00299_P3, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV of R00299_P3.

3. An isolated polypeptide encoding for a tail of R00299_P3, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
VEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIE
TKMHVRFLNMETMALCH in R00299_P3.
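
The comparison report above describes R00299_P3 as three contiguous stretches (amino acids 1 - 44, 45 - 162 and 163 - 238). The sketch below joins the three stretches and recomputes their 1-based coordinates as a consistency check. The residue strings are transcribed from this section with OCR damage repaired against the alignments, so they are an assumption to be verified against the sequence listing; the function name is illustrative.

```python
# Residue strings transcribed from this section (assumption: the OCR repairs
# are correct; verify against the sequence listing before relying on them).
HEAD = "MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV"
MIDDLE = (
    "SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNRNLRKGPSGLA"
    "DEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRITLEEYRNV"
)
TAIL = (
    "VEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIE"
    "TKMHVRFLNMETMALCH"
)


def assemble(parts):
    """Join parts in order; return the full sequence and 1-based inclusive ranges."""
    sequence = ""
    ranges = []
    for part in parts:
        start = len(sequence) + 1
        sequence += part
        ranges.append((start, len(sequence)))
    return sequence, ranges


protein, ranges = assemble([HEAD, MIDDLE, TAIL])
print(len(protein), ranges)
```

With these strings the recomputed ranges match the coordinates stated in the report, and residue 120 of the assembled sequence is an arginine.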

Comparison report between R00299_P3 and TESC_HUMAN:

1. An isolated chimeric polypeptide encoding for R00299_P3, comprising a first amino
acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV
corresponding to amino acids 1 - 44 of R00299_P3, and a second amino acid sequence being at
least 90% homologous to
SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNRNLRKGPSGLA
DEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRITLEEYRNVVE
ELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIETK
MHVRFLNMETMALCH corresponding to amino acids 21 - 214 of TESC_HUMAN, which
also corresponds to amino acids 45 - 238 of R00299_P3, wherein said first and second amino
acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of R00299_P3, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPV of R00299_P3.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because one of the two
signal-peptide prediction programs (HMM: Signal peptide, NN: NO) predicts that this protein
has a signal peptide.

Variant protein R00299_P3 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 313 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R00299_P3 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 313 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
120                                       R -> G                       No
120                                       R -> W                       No



Variant protein R00299_P3 is encoded by the following transcript(s): R00299_T2, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
R00299_T2 is shown in bold; this coding portion starts at position 142 and ends at position 855.
The transcript also has the following SNPs as listed in Table 314 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R00299_P3 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 314 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
177                                    C -> A                      Yes
499                                    C -> G                      No
499                                    C -> T                      No
900                                    G -> T                      Yes
916                                    G ->                        No
969                                    G ->                        No
969                                    G -> A                      No
987                                    A -> C                      No
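
The nucleotide SNP positions can be related to the amino acid positions of the variant protein through the stated coding portion (positions 142 to 855 of R00299_T2). The sketch below is one way to do that, assuming 1-based inclusive coordinates and a reading frame that begins at the first coding position; the function name is hypothetical.

```python
CDS_START = 142  # first coding position of R00299_T2, as stated in the text
CDS_END = 855    # last coding position of R00299_T2, as stated in the text


def locate_in_cds(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Map a 1-based transcript position to (codon number, position within codon).

    Returns None when the position lies outside the coding portion.
    Assumption: the reading frame starts exactly at cds_start.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


for position in (177, 499, 900, 916, 969, 987):
    print(position, locate_in_cds(position))
```

Under these assumptions position 499 falls on the first base of codon 120, which is consistent with the two amino acid changes listed at residue 120, while positions 900 and above lie downstream of the coding portion.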

As noted above, cluster R00299 features 12 segment(s), which were listed in Table 309
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R00299_node_2 according to the present invention is supported by 3
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 315 below describes the starting and
ending position of this segment on each transcript.

Table 315 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          1                            271



Segment cluster R00299_node_30 according to the present invention is supported by 75
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 316 below describes the starting and
ending position of this segment on each transcript.

Table 316 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          790                          961



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
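
The split between the longer segments and the short segments can be reproduced from the segment coordinates given for R00299_T2 in Tables 315 to 326. The sketch below assumes 1-based inclusive positions and treats "up to about 120 bp" as a cutoff of 120 bp or less; the cutoff constant and the variable names are illustrative.

```python
# Segment coordinates on R00299_T2 as given in Tables 315-326
# (assumption: 1-based, inclusive start and end positions).
SEGMENTS = {
    "R00299_node_2": (1, 271),
    "R00299_node_5": (272, 341),
    "R00299_node_9": (342, 345),
    "R00299_node_10": (346, 422),
    "R00299_node_14": (423, 537),
    "R00299_node_15": (538, 562),
    "R00299_node_20": (563, 624),
    "R00299_node_23": (625, 732),
    "R00299_node_25": (733, 780),
    "R00299_node_28": (781, 789),
    "R00299_node_30": (790, 961),
    "R00299_node_31": (962, 1069),
}

SHORT_MAX_BP = 120  # assumed reading of "up to about 120 bp"


def segment_length(start, end):
    """Length in bp of a segment with inclusive coordinates."""
    return end - start + 1


lengths = {name: segment_length(*coords) for name, coords in SEGMENTS.items()}
short_segments = sorted(name for name, bp in lengths.items() if bp <= SHORT_MAX_BP)
long_segments = sorted(name for name, bp in lengths.items() if bp > SHORT_MAX_BP)
print("long:", long_segments)
print("short:", short_segments)
```

With this cutoff only R00299_node_2 and R00299_node_30 exceed 120 bp, which matches the two segments described before the short-segment paragraph.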

Segment cluster R00299_node_10 according to the present invention is supported by 4
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 317 below describes the starting and
ending position of this segment on each transcript.

Table 317 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          346                          422









Segment cluster R00299_node_14 according to the present invention is supported by 61
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 318 below describes the starting and
ending position of this segment on each transcript.

Table 318 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          423                          537



Segment cluster R00299_node_15 according to the present invention can be found in the
following transcript(s): R00299_T2. Table 319 below describes the starting and ending position
of this segment on each transcript.

Table 319 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          538                          562



Segment cluster R00299_node_20 according to the present invention is supported by 66
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 320 below describes the starting and
ending position of this segment on each transcript.

Table 320 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          563                          624

Segment cluster R00299_node_23 according to the present invention is supported by 71
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 321 below describes the starting and
ending position of this segment on each transcript.

Table 321 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          625                          732



Segment cluster R00299_node_25 according to the present invention is supported by 62
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 322 below describes the starting and
ending position of this segment on each transcript.

Table 322 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          733                          780



Segment cluster R00299_node_28 according to the present invention can be found in the
following transcript(s): R00299_T2. Table 323 below describes the starting and ending position
of this segment on each transcript.

Table 323 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          781                          789



Segment cluster R00299_node_31 according to the present invention is supported by 48
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 324 below describes the starting and
ending position of this segment on each transcript.

Table 324 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          962                          1069



Segment cluster R00299_node_5 according to the present invention is supported by 45
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R00299_T2. Table 325 below describes the starting and
ending position of this segment on each transcript.

Table 325 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          272                          341



Segment cluster R00299_node_9 according to the present invention can be found in the
following transcript(s): R00299_T2. Table 326 below describes the starting and ending position
of this segment on each transcript.

Table 326 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R00299_T2          342                          345



Microarray (chip) data is also available for this gene as follows. As described above with
regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotide was
found to hit this segment (with regard to lung cancer), shown in Table 327.

Table 327 - Oligonucleotide related to this gene

Oligonucleotide name    Overexpressed in cancers    Chip reference
R00299_0_8_0            lung cancer                 Lung



5 

Variant protein alignment to the previously known protein:

Sequence name: /tmp/OleVDhrKQO/EjblgLomjM: Q9NWT9

Sequence documentation:

Alignment of: R00299_P3 x Q9NWT9

Alignment segment 1/1:

Quality: 1162.00
Escore: 0
Matching length: 118
Total length: 118
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  45 SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNR 94
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  74 SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNR 123

  95 NLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFH 144
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 124 NLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFH 173

 145 MYDSDSDGRITLEEYRNV 162
     ||||||||||||||||||
 174 MYDSDSDGRITLEEYRNV 191



Sequence name: /tmp/OleVDhrKQO/EjblgLomjM: TESC_HUMAN

Sequence documentation:

Alignment of: R00299_P3 x TESC_HUMAN

Alignment segment 1/1:

Quality: 1920.00
Escore: 0
Matching length: 194
Total length: 194
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  45 SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNR 94
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  21 SSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNR 70

  95 NLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFH 144
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  71 NLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFH 120

 145 MYDSDSDGRITLEEYRNVVEELLSGNPHIEKESARSIADGAMMEAASVCM 194
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 121 MYDSDSDGRITLEEYRNVVEELLSGNPHIEKESARSIADGAMMEAASVCM 170

 195 GQMEPDQVYEGITFEDFLKIWQGIDIETKMHVRFLNMETMALCH 238
     ||||||||||||||||||||||||||||||||||||||||||||
 171 GQMEPDQVYEGITFEDFLKIWQGIDIETKMHVRFLNMETMALCH 214

DESCRIPTION FOR CLUSTER W60282
Cluster W60282 features 1 transcript(s) and 6 segment(s) of interest, the names for which
are given in Tables 328 and 329, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 330.

Table 328 - Transcripts of interest

Transcript Name      Sequence ID No.
W60282_PEA_1_T11     34


Table 329 - Segments of interest

Segment Name             Sequence ID No.
W60282_PEA_1_node_10     434
W60282_PEA_1_node_18     435
W60282_PEA_1_node_22     436
W60282_PEA_1_node_5      437
W60282_PEA_1_node_21     438
W60282_PEA_1_node_8      439


Table 330 - Proteins of interest

Protein Name         Sequence ID No.
W60282_PEA_1_P14     1312



These sequences are variants of the known protein Kallikrein 11 precursor (SwissProt
accession identifier KLKB_HUMAN; known also according to the synonyms EC 3.4.21.-;
Hippostasin; Trypsin-like protease), SEQ ID NO: 1428, referred to herein as the previously
known protein.

Protein Kallikrein 11 precursor is known or believed to have the following function(s):
Possible multifunctional protease. Efficiently cleaves bz-Phe-Arg-4-methylcoumaryl-7-amide, a
kallikrein substrate, and weakly cleaves other substrates for kallikrein and trypsin. The sequence
for protein Kallikrein 11 precursor is given at the end of the application, as "Kallikrein 11
precursor amino acid sequence". Protein Kallikrein 11 precursor localization is believed to be
Secreted.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: proteolysis and peptidolysis, which are annotation(s) related to
Biological Process; and chymotrypsin; trypsin; serine-type peptidase; hydrolase, which are
annotation(s) related to Molecular Function.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster W60282 features 1 transcript(s), which were listed in Table 328
above. These transcript(s) encode for protein(s) which are variant(s) of protein Kallikrein 11
precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein W60282_PEA_1_P14 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
W60282_PEA_1_T11. An alignment is given to the known protein (Kallikrein 11 precursor) at
the end of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between W60282_PEA_1_P14 and Q8IXD7 (SEQ ID NO:1705):

1. An isolated chimeric polypeptide encoding for W60282_PEA_1_P14, comprising a first
amino acid sequence being at least 90% homologous to
MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTAAHCLKP
corresponding to amino acids 1 - 66 of Q8IXD7, which also corresponds to amino
acids 1 - 66 of W60282_PEA_1_P14, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
TPASHLAMRQHHHH corresponding to amino acids 67 - 80 of W60282_PEA_1_P14,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of W60282_PEA_1_P14, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence TPASHLAMRQHHHH in W60282_PEA_1_P14.
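
The head-and-tail structure described in the comparison report can be illustrated with a short Python sketch. It assumes that the variant is simply the 66-residue head followed by the 14-residue tail quoted in the report, and it uses only residues 1 - 72 of KLKB_HUMAN, as they appear in the alignment section of this application. The function name split_head_tail is hypothetical and is not part of the application.

```python
from typing import Tuple


def split_head_tail(variant: str, known: str) -> Tuple[str, str]:
    """Split a variant into the longest prefix it shares with a known
    protein (the head) and whatever follows that prefix (the tail)."""
    shared = 0
    for a, b in zip(variant, known):
        if a != b:
            break
        shared += 1
    return variant[:shared], variant[shared:]


# Head and tail as quoted in the comparison report (assumed to be contiguous).
HEAD = "MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTAAHCLKP"
TAIL = "TPASHLAMRQHHHH"
variant = HEAD + TAIL

# Residues 1 - 72 of KLKB_HUMAN as shown in the alignment; the rest of the
# known protein is not needed to locate the point of divergence.
known_prefix = HEAD + "RYIVHL"

head, tail = split_head_tail(variant, known_prefix)
print(len(head), tail)
```

Under these assumptions the shared head is 66 residues long and the remainder is the 14-residue tail, which agrees with the amino acid ranges 1 - 66 and 67 - 80 given in the report.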

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein W60282_PEA_1_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 331 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein W60282_PEA_1_P14
sequence provides support for the deduced sequence of this variant protein according to the
present invention).
Table 331 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
17                                        G -> E                       Yes
41                                        E -> K                       No



Variant protein W60282_PEA_1_P14 is encoded by the following transcript(s):
W60282_PEA_1_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript W60282_PEA_1_T11 is shown in bold; this coding portion starts at
position 705 and ends at position 944. The transcript also has the following SNPs as listed in
Table 332 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein W60282_PEA_1_P14 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 332 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
219                                    A -> G                      Yes
702                                    G -> A                      Yes
754                                    G -> A                      Yes
825                                    G -> A                      No
1289                                   A -> G                      Yes
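
The coding-portion coordinates and the two SNP tables can be cross-checked with simple arithmetic. The Python sketch below is illustrative only. It assumes that positions are 1-based and inclusive, that position 705 is the first base of codon 1, and that the stated range 705 - 944 does not include a stop codon. The function names are hypothetical and are not part of the application.

```python
# Coding portion of transcript W60282_PEA_1_T11 (1-based, inclusive),
# as stated in the text above.
CDS_START, CDS_END = 705, 944


def cds_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive 1-based range."""
    return end - start + 1


def codon_number(nt_pos: int, cds_start: int = CDS_START, cds_end: int = CDS_END):
    """Return the 1-based codon index containing nt_pos,
    or None when the position lies outside the coding portion."""
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Nucleotide SNP positions copied from Table 332.
snp_positions = [219, 702, 754, 825, 1289]
codons = {pos: codon_number(pos) for pos in snp_positions}

print(cds_length(CDS_START, CDS_END) // 3)  # number of codons in the coding portion
print(codons)
```

Under these assumptions the coding portion spans 240 nucleotides, that is 80 codons, which matches the 80-residue variant protein. Positions 754 and 825 fall in codons 17 and 41, the two amino acid positions listed in Table 331, while positions 219, 702 and 1289 lie outside the coding portion.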



As noted above, cluster W60282 features 6 segment(s), which were listed in Table 329
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster W60282_PEA_1_node_10 according to the present invention is
supported by 45 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): W60282_PEA_1_T11. Table 333 below
describes the starting and ending position of this segment on each transcript.

Table 333 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     745                          901



Segment cluster W60282_PEA_1_node_18 according to the present invention is
supported by 49 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): W60282_PEA_1_T11. Table 334 below
describes the starting and ending position of this segment on each transcript.

Table 334 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     902                          1038



Segment cluster W60282_PEA_1_node_22 according to the present invention is
supported by 67 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): W60282_PEA_1_T11. Table 335 below
describes the starting and ending position of this segment on each transcript.

Table 335 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     1072                         1507

Segment cluster W60282_PEA_1_node_5 according to the present invention is supported
by 20 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): W60282_PEA_1_T11. Table 336 below describes
the starting and ending position of this segment on each transcript.



Table 336 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     1                            669



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
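
The segment coordinates reported for transcript W60282_PEA_1_T11 in Tables 333 to 338 can be checked for consistency with a few lines of Python. The sketch below is illustrative only. It assumes 1-based inclusive coordinates and treats 120 bp as a hard cutoff for a short segment, whereas the text says only "up to about 120 bp". The helper names are hypothetical.

```python
# (segment, start, end) on transcript W60282_PEA_1_T11,
# copied from Tables 333 to 338 of this section.
SEGMENTS = [
    ("node_10", 745, 901),
    ("node_18", 902, 1038),
    ("node_22", 1072, 1507),
    ("node_5", 1, 669),
    ("node_21", 1039, 1071),
    ("node_8", 670, 744),
]

SHORT_SEGMENT_MAX_BP = 120  # assumed hard cutoff for "up to about 120 bp"


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def is_contiguous(segments) -> bool:
    """True when the segments, ordered by start, abut with no gap or overlap."""
    ordered = sorted(segments, key=lambda seg: seg[1])
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(ordered, ordered[1:]))


short_segments = sorted(
    name for name, start, end in SEGMENTS
    if segment_length(start, end) <= SHORT_SEGMENT_MAX_BP
)

print(is_contiguous(SEGMENTS), short_segments)
```

With the tabulated coordinates the six segments cover positions 1 to 1507 without gaps or overlaps, and only node_21 (33 bp) and node_8 (75 bp) fall under the assumed cutoff. These are the two segments described separately as short segments.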



Segment cluster W60282_PEA_1_node_21 according to the present invention is
supported by 48 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): W60282_PEA_1_T11. Table 337 below
describes the starting and ending position of this segment on each transcript.

Table 337 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     1039                         1071



Segment cluster W60282_PEA_1_node_8 according to the present invention is supported
by 39 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): W60282_PEA_1_T11. Table 338 below describes
the starting and ending position of this segment on each transcript.

Table 338 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
W60282_PEA_1_T11     670                          744

Variant protein alignment to the previously known protein:

Sequence name: /tmp/rL7Wdc5hYg/eLOAfKIgqD: KLKB_HUMAN

Sequence documentation:

Alignment of: W60282_PEA_1_P14 x KLKB_HUMAN

Alignment segment 1/1:

Quality: 645.00
Escore: 0
Matching length: 72
Total length: 72
Matching Percent Similarity: 94.44
Matching Percent Identity: 94.44
Total Percent Similarity: 94.44
Total Percent Identity: 94.44
Gaps: 0

Alignment:

   1 MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGAT 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGAT 50

  51 LIAPRWLLTAAHCLKPTPASHL 72
     ||||||||||||||||    ||
  51 LIAPRWLLTAAHCLKPRYIVHL 72
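
The identity figure reported for this alignment can be reproduced from the two aligned rows. The Python sketch below is illustrative only. It assumes that percent identity is the number of identical positions divided by the alignment length, which is appropriate here because the alignment is ungapped (Gaps: 0). The alignment program used in the application may define the statistic differently, and the function name percent_identity is hypothetical.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of positions at which two equal-length aligned rows agree."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = sum(a == b for a, b in zip(row_a, row_b))
    return 100.0 * matches / len(row_a)


# Residues 1 - 50 are identical in both rows of the alignment.
SHARED_1_50 = "MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGAT"

# Residues 51 - 72 of each row, copied from the alignment.
variant_row = SHARED_1_50 + "LIAPRWLLTAAHCLKPTPASHL"  # W60282_PEA_1_P14
known_row = SHARED_1_50 + "LIAPRWLLTAAHCLKPRYIVHL"    # KLKB_HUMAN

identity = round(percent_identity(variant_row, known_row), 2)
print(identity)
```

Of the 72 aligned positions, 68 are identical; the four differences are at positions 67 to 70 (TPAS in the variant, RYIV in the known protein). This gives 94.44, the value reported in the alignment statistics.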



Sequence name: /tmp/rL7Wdc5hYg/eLOAfKIgqD: Q8IXD7

Sequence documentation:

Alignment of: W60282_PEA_1_P14 x Q8IXD7

Alignment segment 1/1:

Quality: 642.00
Escore: 0
Matching length: 66
Total length: 66
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGAT 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGAT 50

  51 LIAPRWLLTAAHCLKP 66
     ||||||||||||||||
  51 LIAPRWLLTAAHCLKP 66

DESCRIPTION FOR CLUSTER Z41644
Cluster Z41644 features 1 transcript(s) and 21 segment(s) of interest, the names for which
are given in Tables 339 and 340, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 341.

Table 339 - Transcripts of interest

Transcript Name      Sequence ID No.
Z41644_PEA_1_T5      35


Table 340 - Segments of interest

Segment Name             Sequence ID No.
Z41644_PEA_1_node_0      440
Z41644_PEA_1_node_11     441
Z41644_PEA_1_node_12     442
Z41644_PEA_1_node_15     443
Z41644_PEA_1_node_20     444
Z41644_PEA_1_node_24     445
Z41644_PEA_1_node_1      446
Z41644_PEA_1_node_10     447
Z41644_PEA_1_node_13     448
Z41644_PEA_1_node_16     449
Z41644_PEA_1_node_17     450
Z41644_PEA_1_node_19     451
Z41644_PEA_1_node_2      452
Z41644_PEA_1_node_21     453
Z41644_PEA_1_node_22     454
Z41644_PEA_1_node_23     455
Z41644_PEA_1_node_25     456
Z41644_PEA_1_node_3      457
Z41644_PEA_1_node_4      458
Z41644_PEA_1_node_6      459
Z41644_PEA_1_node_9      460


Table 341 - Proteins of interest

Protein Name         Sequence ID No.
Z41644_PEA_1_P10     1313



These sequences are variants of the known protein Small inducible cytokine B14
precursor (SwissProt accession identifier SZ14_HUMAN; known also according to the
synonyms CXCL14; Chemokine BRAK), SEQ ID NO: 1429, referred to herein as the previously
known protein.

The sequence for protein Small inducible cytokine B14 precursor is given at the end of
the application, as "Small inducible cytokine B14 precursor amino acid sequence". Protein
Small inducible cytokine B14 precursor localization is believed to be Secreted.

The following GO Annotation(s) apply to the previously known protein. The following 
annotation(s) were found: chemotaxis; signal transduction; cell-cell signaling, which are 
annotation(s) related to Biological Process; and chemokine, which are annotation(s) related to 
Molecular Function. 

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster Z41644 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 27 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
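
The "parts per million" scale described above can be written as a one-line ratio. The Python sketch below is illustrative only. It shows the plain ratio and does not model the weighting of ESTs, whose details are given elsewhere in the application. The function name ests_ppm is hypothetical, and the counts in the example are invented for illustration; they are not data from the application.

```python
def ests_ppm(cluster_ests: float, category_ests: float) -> float:
    """ESTs of one cluster per million ESTs in a tissue category."""
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Invented counts: 3 ESTs from the cluster among 500,000 ESTs in a category.
example = ests_ppm(3, 500_000)
print(example)
```

On this scale an entry of 6 for a tissue would mean that about 6 of every million ESTs in that category belong to the cluster.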

Overall, the following results were obtained as shown with regard to the histograms in
Figure 27 and Table 342. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: lung malignant tumors, breast malignant tumors and pancreas
carcinoma.



Table 342 - Normal tissue distribution

Name of Tissue    Number
bone              45
brain             62
colon             327
epithelial        179
general           104
head and neck     10
kidney            219
lung              6
lymph nodes       37
breast            87
bone marrow       0
muscle            20
ovary             36
pancreas          0
prostate          78
skin              591
stomach           109
Thyroid           386
uterus            218



Table 343 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
bone              4.9e-01    8.5e-01    1.8e-01    1.9    5.3e-01    1.0
brain             6.7e-01    8.0e-01    9.1e-01    0.6    9.9e-01    0.4
colon             6.4e-01    7.7e-01    9.7e-01    0.4    1          0.3
epithelial        4.1e-01    9.4e-01    9.6e-01    0.7    1          0.4
general           1.5e-01    9.4e-01    1.8e-01    1.0    1          0.5
head and neck     1.9e-01    3.3e-01    4.6e-01    2.8    7.5e-01    1.5
kidney            7.7e-01    8.2e-01    7.0e-01    0.7    9.5e-01    0.5
lung              2.2e-01    5.0e-01    1.3e-04    8.7    8.1e-03    4.1
lymph nodes       6.3e-01    8.7e-01    6.3e-01    1.2    9.2e-01    0.6
breast            4.0e-01    6.5e-01    3.9e-04    3.5    2.9e-02    1.9
bone marrow       1          6.7e-01    1          1.0    5.3e-01    1.9
muscle            5.2e-01    6.1e-01    2.7e-01    3.2    6.3e-01    1.2
ovary             6.7e-01    7.1e-01    7.6e-01    1.0    8.6e-01    0.8
pancreas          2.2e-02    2.3e-02    5.7e-03    7.8    1.6e-03    8.2
prostate          8.8e-01    9.0e-01    8.3e-01    0.6    9.3e-01    0.5
skin              5.9e-01    6.9e-01    2.3e-01    0.3    1          0.0
stomach           6.1e-01    8.9e-01    8.1e-01    0.7    9.9e-01    0.4
Thyroid           7.0e-01    7.0e-01    9.9e-01    0.4    9.9e-01    0.4
uterus            5.3e-01    8.2e-01    9.5e-01    0.5    1          0.3



As noted above, cluster Z41644 features 1 transcript(s), which were listed in Table 339
above. These transcript(s) encode for protein(s) which are variant(s) of protein Small inducible
cytokine B14 precursor. A description of each variant protein according to the present invention
is now provided.

Variant protein Z41644_PEA_1_P10 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z41644_PEA_1_T5. An alignment is given to the known protein (Small inducible cytokine B14
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z41644_PEA_1_P10 and SZ14_HUMAN:

1. An isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MRLLAAALLLLLLALYTARVDGS
TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 1 -
95 of SZ14_HUMAN, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to
amino acids 96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid
sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10.

Comparison report between Z41644_PEA_1_P10 and Q9NS21 (SEQ ID NO:1706):

1. An isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MRLLAAALLLLLLALYTARVDGSK
TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 13 -
107 of Q9NS21, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to amino acids
96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10.

Comparison report between Z41644_PEA_1_P10 and AAQ89265 (SEQ ID NO:781): 

1. An isolated chimeric polypeptide encoding for Z41644_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH
TTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR corresponding to amino acids 13 -
107 of AAQ89265, which also corresponds to amino acids 1 - 95 of Z41644_PEA_1_P10, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI corresponding to amino acids
96 - 123 of Z41644_PEA_1_P10, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z41644_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence YAPPLLTFLPTRPSCGSQDGKGPPHQVI in Z41644_PEA_1_P10.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
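
The reasoning in the preceding paragraph amounts to a simple decision rule, sketched here in Python. The sketch is illustrative only. It encodes just the one case stated in the text (every signal-peptide predictor reports a signal peptide and no trans-membrane predictor reports a trans-membrane region) and returns "undetermined" for every other combination, because the text does not say how other outcomes are resolved. The function name and the boolean inputs are hypothetical and do not represent the interface of SignalP or of any other program.

```python
from typing import Sequence


def predict_localization(
    signal_peptide_calls: Sequence[bool],
    transmembrane_calls: Sequence[bool],
) -> str:
    """Apply the 'secreted' rule described in the text.

    signal_peptide_calls: one boolean per signal-peptide prediction program.
    transmembrane_calls: one boolean per trans-membrane prediction program.
    """
    if not signal_peptide_calls or not transmembrane_calls:
        return "undetermined"  # no predictions available for one of the two checks
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"  # other combinations are not covered by the text


# Both signal-peptide programs agree; neither trans-membrane program fires.
print(predict_localization([True, True], [False, False]))
```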

Variant protein Z41644_PEA_1_P10 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 344 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z41644_PEA_1_P10
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 344 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
32                                        P -> H                       Yes
64                                        S ->                         No
80                                        T -> A                       No
80                                        T -> P                       No



Variant protein Z41644_PEA_1_P10 is encoded by the following transcript(s):
Z41644_PEA_1_T5, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z41644_PEA_1_T5 is shown in bold; this coding portion starts at
position 744 and ends at position 1112. The transcript also has the following SNPs as listed in
Table 345 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z41644_PEA_1_P10 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 345 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
102                                    A -> G                      Yes
572                                    C ->                        No
3707                                   C -> T                      Yes
3735                                   C -> T                      Yes
4079                                   G -> A                      No
4123                                   G -> A                      Yes
4233                                   A -> G                      Yes
4328                                   C ->                        No
4350                                   A -> G                      Yes
4376                                   G -> A                      Yes
4390                                   A -> G                      Yes
4619                                   G -> T                      Yes
838                                    C -> A                      Yes
4754                                   C -> T                      No
4757                                   C -> A                      No
4794                                   T -> G                      No
4827                                   G ->                        No
934                                    C ->                        No
981                                    A -> C                      No
981                                    A -> G                      No
1817                                   A -> C                      Yes
2546                                   T ->                        No
2684                                   T -> A                      No
2885                                   T -> C                      Yes


As noted above, cluster Z41644 features 21 segment(s), which were listed in Table 340
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster Z41644_PEA_1_node_0 according to the present invention is supported
by 53 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 346 below describes the
starting and ending position of this segment on each transcript.



Table 346 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      1                            616

Segment cluster Z41644_PEA_1_node_11 according to the present invention is supported
by 9 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 347 below describes the
starting and ending position of this segment on each transcript.

Table 347 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      1028                         2089



Segment cluster Z41644_PEA_1_node_12 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 348 below describes the
starting and ending position of this segment on each transcript.

Table 348 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      2090                         2350



Segment cluster Z41644_PEA_1_node_15 according to the present invention is supported
by 23 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 349 below describes the
starting and ending position of this segment on each transcript.



Table 349 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      2368                         3728

Segment cluster Z41644_PEA_1_node_20 according to the present invention is supported
by 260 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 350 below describes the
starting and ending position of this segment on each transcript.



Table 350 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      3938                         4506



Segment cluster Z41644_PEA_1_node_24 according to the present invention is supported
by 185 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 351 below describes the
starting and ending position of this segment on each transcript.



Table 351 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      4637                         4799



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster Z41644_PEA_1_node_1 according to the present invention is supported
by 53 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 352 below describes the
starting and ending position of this segment on each transcript.



Table 352 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      617                          697

Segment cluster Z41644_PEA_1_node_10 according to the present invention is supported
by 138 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 353 below describes the
starting and ending position of this segment on each transcript.



Table 353 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      972                          1027



Segment cluster Z41644_PEA_1_node_13 according to the present invention can be
found in the following transcript(s): Z41644_PEA_1_T5. Table 354 below describes the starting
and ending position of this segment on each transcript.

Table 354 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
Z41644_PEA_1_T5      2351                         2367



Segment cluster Z41644_PEA_1_node_16 according to the present invention is supported
by 152 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 355 below describes the
starting and ending position of this segment on each transcript.



Table 355 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      3729                           3809



Segment cluster Z41644_PEA_1_node_17 according to the present invention can be
found in the following transcript(s): Z41644_PEA_1_T5. Table 356 below describes the starting
and ending position of this segment on each transcript.

Table 356 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      3810                           3829



Segment cluster Z41644_PEA_1_node_19 according to the present invention is supported
by 112 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 357 below describes the
starting and ending position of this segment on each transcript.



Table 357 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      3830                           3937



Segment cluster Z41644_PEA_1_node_2 according to the present invention is supported
by 58 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 358 below describes the
starting and ending position of this segment on each transcript.

Table 358 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      698                            737



Segment cluster Z41644_PEA_1_node_21 according to the present invention can be
found in the following transcript(s): Z41644_PEA_1_T5. Table 359 below describes the starting
and ending position of this segment on each transcript.

Table 359 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      4507                           4529









Segment cluster Z41644_PEA_1_node_22 according to the present invention is supported
by 164 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 360 below describes the
starting and ending position of this segment on each transcript.



Table 360 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      4530                           4582



Segment cluster Z41644_PEA_1_node_23 according to the present invention is supported
by 169 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 361 below describes the
starting and ending position of this segment on each transcript.



Table 361 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      4583                           4636



Segment cluster Z41644_PEA_1_node_25 according to the present invention is supported
by 138 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 362 below describes the
starting and ending position of this segment on each transcript.



Table 362 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      4800                           4902

Segment cluster Z41644_PEA_1_node_3 according to the present invention is supported
by 75 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 363 below describes the
starting and ending position of this segment on each transcript.



Table 363 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      738                            773



Segment cluster Z41644_PEA_1_node_4 according to the present invention is supported
by 61 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 364 below describes the
starting and ending position of this segment on each transcript.



Table 364 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      774                            807



Segment cluster Z41644_PEA_1_node_6 according to the present invention is supported
by 101 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 365 below describes the
starting and ending position of this segment on each transcript.

Table 365 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      808                            913

Segment cluster Z41644_PEA_1_node_9 according to the present invention is supported
by 134 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z41644_PEA_1_T5. Table 366 below describes the
starting and ending position of this segment on each transcript.



Table 366 - Segment location on transcripts

Transcript name      Segment starting position      Segment ending position
Z41644_PEA_1_T5      914                            971
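
The segment coordinates in Tables 352 to 366 can be cross-checked mechanically. The following is a minimal sketch, not part of the application: it copies the start and end positions on transcript Z41644_PEA_1_T5 from those tables, assumes the positions are 1-based and inclusive, and uses hypothetical helper names (`segment_length`, `adjacent_pairs`). It reports the length of each short segment, so that the "up to about 120 bp" description can be checked, and lists the segments that directly abut one another on the transcript.

```python
# Start and end positions on transcript Z41644_PEA_1_T5, copied from Tables 352-366.
# Assumption (not stated in the tables): positions are 1-based and inclusive.
SHORT_SEGMENTS = {
    "Z41644_PEA_1_node_1": (617, 697),
    "Z41644_PEA_1_node_10": (972, 1027),
    "Z41644_PEA_1_node_13": (2351, 2367),
    "Z41644_PEA_1_node_16": (3729, 3809),
    "Z41644_PEA_1_node_17": (3810, 3829),
    "Z41644_PEA_1_node_19": (3830, 3937),
    "Z41644_PEA_1_node_2": (698, 737),
    "Z41644_PEA_1_node_21": (4507, 4529),
    "Z41644_PEA_1_node_22": (4530, 4582),
    "Z41644_PEA_1_node_23": (4583, 4636),
    "Z41644_PEA_1_node_25": (4800, 4902),
    "Z41644_PEA_1_node_3": (738, 773),
    "Z41644_PEA_1_node_4": (774, 807),
    "Z41644_PEA_1_node_6": (808, 913),
    "Z41644_PEA_1_node_9": (914, 971),
}


def segment_length(start, end):
    """Length in bases of an inclusive, 1-based interval."""
    return end - start + 1


def adjacent_pairs(segments):
    """Name pairs where the second segment starts one base after the first ends."""
    ordered = sorted(segments.items(), key=lambda item: item[1][0])
    pairs = []
    for (left, (_, left_end)), (right, (right_start, _)) in zip(ordered, ordered[1:]):
        if right_start == left_end + 1:
            pairs.append((left, right))
    return pairs


lengths = {name: segment_length(*pos) for name, pos in SHORT_SEGMENTS.items()}

if __name__ == "__main__":
    for name in sorted(lengths, key=lambda n: SHORT_SEGMENTS[n][0]):
        print(f"{name}: {lengths[name]} bp")
    print("abutting segments:", adjacent_pairs(SHORT_SEGMENTS))
```

Under the inclusive-coordinate assumption, every listed segment is shorter than 120 bases, and the segment of Table 351 (positions 4637 to 4799) fills the gap between node_23 and node_25.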



Variant protein alignment to the previously known protein:

Sequence name: /tmp/p5SSvhT9Xp/HQeIMsUrfm: SZ14_HUMAN

Sequence documentation:

Alignment of: Z41644_PEA_1_P10 x SZ14_HUMAN

Alignment segment 1/1:

Quality: 953.00
Escore: 0
Matching length: 95
Total length: 95
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 50

 51 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR 95
    |||||||||||||||||||||||||||||||||||||||||||||
 51 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR 95



Sequence name: /tmp/p5SSvhT9Xp/HQeIMsUrfm: Q9NS21

Sequence documentation:

Alignment of: Z41644_PEA_1_P10 x Q9NS21

Alignment segment 1/1:

Quality: 957.00
Escore: 0
Matching length: 96
Total length: 96
Matching Percent Similarity: 100.00
Matching Percent Identity: 98.96
Total Percent Similarity: 100.00
Total Percent Identity: 98.96
Gaps: 0

Alignment:

  1 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 13 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 62

 51 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRRY 96
    |||||||||||||||||||||||||||||||||||||||||||||:
 63 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRRF 108



Sequence name: /tmp/p5SSvhT9Xp/HQeIMsUrfm: AAQ89265

Sequence documentation:

Alignment of: Z41644_PEA_1_P10 x AAQ89265

Alignment segment 1/1:

Quality: 953.00
Escore: 0
Matching length: 95
Total length: 95
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 13 MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH 62

 51 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR 95
    |||||||||||||||||||||||||||||||||||||||||||||
 63 CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRR 107
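
The percent-identity figures reported with these alignments can be reproduced from the aligned residues themselves. The sketch below is an illustration only and is not the alignment software used for the application: it assumes identity is the number of identical aligned positions divided by the alignment length, and the function name `percent_identity` is hypothetical. The two sequences are the 96 aligned residues of Z41644_PEA_1_P10 and Q9NS21, which differ only at the last position (Y versus F).

```python
# Aligned residues copied from the Z41644_PEA_1_P10 x Q9NS21 alignment
# (positions 1-96 of the variant against positions 13-108 of Q9NS21).
P10_ALIGNED = (
    "MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH"
    "CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRRY"
)
Q9NS21_ALIGNED = (
    "MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPH"
    "CEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRRF"
)


def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return round(100.0 * matches / len(aligned_a), 2)


if __name__ == "__main__":
    print(len(P10_ALIGNED), percent_identity(P10_ALIGNED, Q9NS21_ALIGNED))
```

With 95 identical positions out of 96, this definition gives 98.96, which agrees with the identity value listed for the Q9NS21 alignment.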



DESCRIPTION FOR CLUSTER Z44808 
Cluster Z44808 features 5 transcript(s) and 21 segment(s) of interest, the names for which
are given in Tables 367 and 368, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 369.

Table 367 - Transcripts of interest

Transcript Name       Sequence ID No.
Z44808_PEA_1_T11      36
Z44808_PEA_1_T4       37
Z44808_PEA_1_T5       38
Z44808_PEA_1_T8       39
Z44808_PEA_1_T9       40



Table 368 - Segments of interest

Segment Name              Sequence ID No.
Z44808_PEA_1_node_0       461
Z44808_PEA_1_node_16      462
Z44808_PEA_1_node_2       463
Z44808_PEA_1_node_24      464
Z44808_PEA_1_node_32      465
Z44808_PEA_1_node_33      466
Z44808_PEA_1_node_36      467
Z44808_PEA_1_node_37      468
Z44808_PEA_1_node_41      469
Z44808_PEA_1_node_11      470
Z44808_PEA_1_node_13      471
Z44808_PEA_1_node_18      472
Z44808_PEA_1_node_22      473
Z44808_PEA_1_node_26      474
Z44808_PEA_1_node_30      475
Z44808_PEA_1_node_34      476
Z44808_PEA_1_node_35      477
Z44808_PEA_1_node_39      478
Z44808_PEA_1_node_4       479
Z44808_PEA_1_node_6       480
Z44808_PEA_1_node_8       481



Table 369 - Proteins of interest

Protein Name          Sequence ID No.
Z44808_PEA_1_P5       1314
Z44808_PEA_1_P6       1315
Z44808_PEA_1_P7       1316
Z44808_PEA_1_P11      1317

These sequences are variants of the known protein SPARC related modular calcium-binding
protein 2 precursor (SwissProt accession identifier SMO2_HUMAN; known also
according to the synonyms Secreted modular calcium-binding protein 2; SMOC-2; Smooth
muscle-associated protein 2; SMAP-2; MSTP117), SEQ ID NO: 1430, referred to herein as the
previously known protein.

Protein SPARC related modular calcium-binding protein 2 precursor is known or believed
to have the following function(s): calcium binding. The sequence for protein SPARC related
modular calcium-binding protein 2 precursor is given at the end of the application, as "SPARC
related modular calcium-binding protein 2 precursor amino acid sequence". Known
polymorphisms for this sequence are as shown in Table 370.

Table 370 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence      Comment
169 - 170                                   KT -> TR
212                                         S -> P
429 - 446                                   TPRGHAESTSNRQPRKQG -> RSKRNL
434                                         A -> V
439                                         N -> Y



Protein SPARC related modular calcium-binding protein 2 precursor localization is
believed to be Secreted.

Cluster Z44808 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 28 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 28 and Table 371. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: colorectal cancer, lung cancer and pancreas carcinoma.

Table 371 - Normal tissue distribution

Name of Tissue      Number
bladder             123
bone                304
brain               18
colon               0
epithelial          40
general             37
kidney              2
lung                0
breast              61
ovary               116
pancreas            0
prostate            128
stomach             36
uterus              195



Table 372 - P values and ratios for expression in cancerous tissue

Name of Tissue      P1           P2           SP1          R3        SP2          R4
bladder             6.8e-01      7.6e-01      7.7e-01      0.8       9.1e-01      0.6
bone                7.0e-01      8.8e-01      9.9e-01      0.3       1            0.2
brain               6.8e-01      7.2e-01      3.0e-02      2.6       1.7e-01      1.6
colon               9.2e-03      1.3e-02      1.2e-01      3.6       1.6e-01      3.1
epithelial          2.1e-02      4.0e-01      1.0e-04      1.9       2.7e-01      1.0
general             2.6e-02      7.2e-01      4.9e-07      1.9       3.0e-01      1.0
kidney              7.3e-01      8.1e-01      1            1.0       1            1.0
lung                4.0e-03      1.8e-02      8.0e-04      12.2      2.1e-02      6.0
breast              4.8e-01      6.1e-01      9.8e-02      2.0       3.9e-01      1.2
ovary               8.1e-01      8.3e-01      9.1e-01      0.6       9.7e-01      0.5
pancreas            1.2e-01      2.1e-01      1.0e-03      6.5       5.9e-03      4.6
prostate            8.4e-01      8.9e-01      9.0e-01      0.6       9.8e-01      0.4
stomach             5.0e-01      8.7e-01      9.6e-04      1.5       1.9e-01      0.8
uterus              6.7e-01      7.9e-01      9.2e-01      0.5       1            0.3
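
The ratio columns of Table 372 can be scanned programmatically to see which tissues stand out. The sketch below is illustrative only: the cut-off of 3.0 is an assumption chosen for the example and is not the criterion used in the application, and the function name `tissues_above` is hypothetical. Only the R3 and R4 values are copied from the table.

```python
# (R3, R4) ratio values per tissue, copied from Table 372.
TABLE_372_RATIOS = {
    "bladder": (0.8, 0.6),
    "bone": (0.3, 0.2),
    "brain": (2.6, 1.6),
    "colon": (3.6, 3.1),
    "epithelial": (1.9, 1.0),
    "general": (1.9, 1.0),
    "kidney": (1.0, 1.0),
    "lung": (12.2, 6.0),
    "breast": (2.0, 1.2),
    "ovary": (0.6, 0.5),
    "pancreas": (6.5, 4.6),
    "prostate": (0.6, 0.4),
    "stomach": (1.5, 0.8),
    "uterus": (0.5, 0.3),
}

# Hypothetical cut-off for this illustration; the application does not state it here.
ASSUMED_RATIO_CUTOFF = 3.0


def tissues_above(ratios, cutoff, column):
    """Sorted tissue names whose ratio in the given column (0 = R3, 1 = R4) exceeds cutoff."""
    return sorted(name for name, row in ratios.items() if row[column] > cutoff)


if __name__ == "__main__":
    print("R3 >", ASSUMED_RATIO_CUTOFF, tissues_above(TABLE_372_RATIOS, ASSUMED_RATIO_CUTOFF, 0))
    print("R4 >", ASSUMED_RATIO_CUTOFF, tissues_above(TABLE_372_RATIOS, ASSUMED_RATIO_CUTOFF, 1))
```

With this assumed cut-off, both ratio columns single out colon, lung and pancreas, the same three tissues named in the overexpression statement for this cluster.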



As noted above, cluster Z44808 features 5 transcript(s), which were listed in Table 367 
above. These transcript(s) encode for protein(s) which are variant(s) of protein SPARC related 
modular calcium-binding protein 2 precursor. A description of each variant protein according to 
the present invention is now provided. 

Variant protein Z44808_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z44808_PEA_1_T4. An alignment is given to the known protein (SPARC related modular
calcium-binding protein 2 precursor) at the end of the application. One or more alignments to
one or more previously published protein sequences are given at the end of the application. A
brief description of the relationship of the variant protein according to the present invention to
each such aligned protein is as follows:

Comparison report between Z44808_PEA_1_P5 and SMO2_HUMAN:
1. An isolated chimeric polypeptide encoding for Z44808_PEA_1_P5, comprising a first
amino acid sequence being at least 90 % homologous to

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD 
GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA 
APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQE

DNWIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA 

KARDLYKGRQLQGCPGAJKJ^ 

RVVffWYFKLLDKNSSGDIGKK 
ELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ corresponding to amino acids 1 - 441
of SMO2_HUMAN, which also corresponds to amino acids 1 - 441 of Z44808_PEA_1_P5, and
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence DAMVVSSRPKATTHRKSRTLSRR corresponding to amino
acids 442 - 464 of Z44808_PEA_1_P5, wherein said first and second amino acid sequences are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z44808_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DAMVVSSRPKATTHRKSRTLSRR in Z44808_PEA_1_P5.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z44808_PEA_1_P5 is encoded by the following transcript(s):
Z44808_PEA_1_T4, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z44808_PEA_1_T4 is shown in bold; this coding portion starts at
position 586 and ends at position 1977. The transcript also has the following SNPs as listed in
Table 373 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z44808_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 373 - Nucleic acid SNPs

SNP position on nucleotide sequence      Alternative nucleic acid      Previously known SNP?
549                                      A -> G                        No
648                                      T -> G                        No
4403                                     G -> T                        No
4456                                     G -> A                        Yes
4964                                     G -> C                        Yes
1025                                     C ->                          No
1677                                     T -> C                        No
2691                                     C -> T                        Yes
3900                                     T -> C                        No
3929                                     G -> A                        Yes
4099                                     G -> T                        Yes
4281                                     T -> C                        No
4319                                     G -> C                        Yes

Variant protein Z44808_PEA_1_P6 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z44808_PEA_1_T5. An alignment is given to the known protein (SPARC related modular
calcium-binding protein 2 precursor) at the end of the application. One or more alignments to
one or more previously published protein sequences are given at the end of the application. A
brief description of the relationship of the variant protein according to the present invention to
each such aligned protein is as follows:

Comparison report between Z44808_PEA_1_P6 and SMO2_HUMAN:
1. An isolated chimeric polypeptide encoding for Z44808_PEA_1_P6, comprising a first
amino acid sequence being at least 90 % homologous to

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD
GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA 
APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKN 
DNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA
KARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEE 
RVVHWWKLLDKNSSGDIGKKJEIKPFKRFLRKKSKPKKCVKKF 

ELMGCLGVAKEDGKADTKKRH corresponding to amino acids 1 - 428 of SMO2_HUMAN,
which also corresponds to amino acids 1 - 428 of Z44808_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
RSKRNL corresponding to amino acids 429 - 434 of Z44808_PEA_1_P6, wherein said first and
second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z44808_PEA_1_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence RSKRNL in Z44808_PEA_1_P6.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z44808_PEA_1_P6 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 374 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z44808_PEA_1_P6
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 374 - Amino acid mutations

SNP position(s) on amino acid sequence      Alternative amino acid(s)      Previously known SNP?
147                                         A ->                           No

Variant protein Z44808_PEA_1_P6 is encoded by the following transcript(s):
Z44808_PEA_1_T5, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z44808_PEA_1_T5 is shown in bold; this coding portion starts at
position 586 and ends at position 1887. The transcript also has the following SNPs as listed in
Table 375 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z44808_PEA_1_P6 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 375 - Nucleic acid SNPs

SNP position on nucleotide sequence      Alternative nucleic acid      Previously known SNP?
549                                      A -> G                        No
648                                      T -> G                        No
2866                                     G -> A                        Yes
3374                                     G -> C                        Yes
1025                                     C ->                          No
1677                                     T -> C                        No
2310                                     T -> C                        No
2339                                     G -> A                        Yes
2509                                     G -> T                        Yes
2691                                     T -> C                        No
2729                                     G -> C                        Yes
2813                                     G -> T                        No
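
The nucleotide SNP positions and the amino acid SNP positions for this variant can be related through the stated coding portion. The sketch below is an illustration under stated assumptions and is not the method of the application: it assumes positions are 1-based, that the first codon begins at the stated start of the coding portion (586), and that the stated end (1887 for transcript Z44808_PEA_1_T5) is the last coding base. The function name `classify_snp` is hypothetical.

```python
# Coding portion of transcript Z44808_PEA_1_T5 as stated in the text.
CDS_START = 586
CDS_END = 1887


def classify_snp(position, cds_start=CDS_START, cds_end=CDS_END):
    """Return (region, codon_number); codon_number is None outside the coding portion.

    Assumes 1-based nucleotide positions and a reading frame that starts at cds_start.
    """
    if position < cds_start:
        return ("upstream of coding portion", None)
    if position > cds_end:
        return ("downstream of coding portion", None)
    return ("coding", (position - cds_start) // 3 + 1)


if __name__ == "__main__":
    # A few positions taken from Table 375.
    for snp_position in (549, 648, 1025, 1677, 2866):
        print(snp_position, classify_snp(snp_position))
```

Under these assumptions the nucleotide SNP at position 1025 falls in codon 147, which matches the amino acid position given in Table 374, while positions 549 and 2866 lie outside the coding portion.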



Variant protein Z44808_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z44808_PEA_1_T9. An alignment is given to the known protein (SPARC related modular
calcium-binding protein 2 precursor) at the end of the application. One or more alignments to
one or more previously published protein sequences are given at the end of the application. A
brief description of the relationship of the variant protein according to the present invention to
each such aligned protein is as follows:

Comparison report between Z44808_PEA_1_P7 and SMO2_HUMAN:
1. An isolated chimeric polypeptide encoding for Z44808_PEA_1_P7, comprising a first
amino acid sequence being at least 90 % homologous to

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD 

GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKTDDAA
APALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKN 
DNWIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPA 
KARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEE 
RVVHWWKIXDKNSSGDIGKK^ 

ELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ corresponding to amino acids 1 - 441
of SMO2_HUMAN, which also corresponds to amino acids 1 - 441 of Z44808_PEA_1_P7, and
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence LLWLRGKVSFYCF corresponding to amino acids 442 - 454
of Z44808_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and
in a sequential order.

2. An isolated polypeptide encoding for a tail of Z44808_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence LLWLRGKVSFYCF in Z44808_PEA_1_P7.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z44808_PEA_1_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 376 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z44808_PEA_1_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 376 - Amino acid mutations

SNP position(s) on amino acid sequence      Alternative amino acid(s)      Previously known SNP?
147                                         A ->                           No




Variant protein Z44808_PEA_1_P7 is encoded by the following transcript(s):
Z44808_PEA_1_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z44808_PEA_1_T9 is shown in bold; this coding portion starts at
position 586 and ends at position 1947. The transcript also has the following SNPs as listed in
Table 377 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z44808_PEA_1_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 377 - Nucleic acid SNPs

SNP position on nucleotide sequence      Alternative nucleic acid      Previously known SNP?
549                                      A -> G                        No
648                                      T -> G                        No
1025                                     C ->                          No
1677                                     T -> C                        No
2169                                     C -> A                        Yes

Variant protein Z44808_PEA_1_P11 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z44808_PEA_1_T11. The identification of this transcript was performed using a non-EST
based method for identification of alternative splicing, described in the following reference:
"Sorek R et al., Genome Res. (2004) 14:1617-23." An alignment is given to the known protein
(SPARC related modular calcium-binding protein 2 precursor) at the end of the application. One
or more alignments to one or more previously published protein sequences are given at the end
of the application. A brief description of the relationship of the variant protein according to the
present invention to each such aligned protein is as follows:

Comparison report between Z44808_PEA_1_P11 and SMO2_HUMAN:

1. An isolated chimeric polypeptide encoding for Z44808_PEA_1_P11, comprising a first
amino acid sequence being at least 90 % homologous to

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGR 
TFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDD
GTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCPGSVNEKLPQREGTGKT 
corresponding to amino acids 1 - 170 of SMO2_HUMAN, which also corresponds to amino
acids 1 - 170 of Z44808_PEA_1_P11, and a second amino acid sequence being at least 90 %
homologous to

DIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGL

YKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQ 
GCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLD 
KNSSGDIGKKJEIKPFKRFLRK^ 

DGKADTKKRHTPRGHAESTSNRQPRKQG corresponding to amino acids 188 - 446 of
SMO2_HUMAN, which also corresponds to amino acids 171 - 429 of Z44808_PEA_1_P11,
wherein said first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of Z44808_PEA_1_P11,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise TD, having a
structure as follows: a sequence starting from any of amino acid numbers 170-x to 170; and
ending at any of amino acid numbers 171 + ((n-2) - x), in which x varies from 0 to n-2.
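
The edge-portion definition above describes a family of windows of length n that all span the junction between amino acids 170 and 171 (the residues T and D). The following sketch enumerates those windows; it is an illustration only, reads the end formula as 171 + ((n-2) - x), and uses a hypothetical function name `edge_windows`.

```python
def edge_windows(n, left_junction=170, right_junction=171):
    """List (start, end) residue ranges of length n that include both junction residues.

    For each x from 0 to n-2 the window starts at left_junction - x and ends at
    right_junction + ((n - 2) - x), following the structure given in the text.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to cover both junction residues")
    return [
        (left_junction - x, right_junction + ((n - 2) - x))
        for x in range(0, n - 1)
    ]


if __name__ == "__main__":
    for start, end in edge_windows(10):
        print(start, end, "length", end - start + 1)
```

For n = 10 this yields nine windows, from residues 170 to 179 through residues 162 to 171, each exactly ten residues long and each containing positions 170 and 171.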

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z44808_PEA_1_P11 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 378 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z44808_PEA_1_P11
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 378 - Amino acid mutations

SNP position(s) on amino acid sequence      Alternative amino acid(s)      Previously known SNP?
147                                         A ->                           No



Variant protein Z44808_PEA_1_P11 is encoded by the following transcript(s):
Z44808_PEA_1_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z44808_PEA_1_T11 is shown in bold; this coding portion starts at
position 586 and ends at position 1872. The transcript also has the following SNPs as listed in
Table 379 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z44808_PEA_1_P11 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 379 - Nucleic acid SNPs

SNP position on nucleotide sequence      Alternative nucleic acid      Previously known SNP?
549                                      A -> G                        No
648                                      T -> G                        No
2720                                     G -> A                        Yes
3228                                     G -> C                        Yes
1025                                     C ->                          No
1626                                     T -> C                        No
2164                                     T -> C                        No
2193                                     G -> A                        Yes
2363                                     G -> T                        Yes
2545                                     T -> C                        No
2583                                     G -> C                        Yes
2667                                     G -> T                        No
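
The coding-portion coordinates given for the four transcripts can be checked against the variant protein lengths implied by the comparison reports (464, 434, 454 and 429 amino acids for P5, P6, P7 and P11, respectively). The sketch below is a consistency check only, assuming 1-based inclusive coordinates; the names `CODING_PORTIONS` and `codon_count` are hypothetical.

```python
# transcript: (coding start, coding end, encoded variant, variant length in amino acids)
# Coordinates are from the text; lengths are the last residue numbers in the comparison reports.
CODING_PORTIONS = {
    "Z44808_PEA_1_T4": (586, 1977, "Z44808_PEA_1_P5", 464),
    "Z44808_PEA_1_T5": (586, 1887, "Z44808_PEA_1_P6", 434),
    "Z44808_PEA_1_T9": (586, 1947, "Z44808_PEA_1_P7", 454),
    "Z44808_PEA_1_T11": (586, 1872, "Z44808_PEA_1_P11", 429),
}


def codon_count(start, end):
    """Number of whole codons in an inclusive, 1-based coding interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"coding length {length} is not a multiple of 3")
    return length // 3


if __name__ == "__main__":
    for transcript, (start, end, protein, residues) in CODING_PORTIONS.items():
        codons = codon_count(start, end)
        status = "consistent" if codons == residues else "mismatch"
        print(f"{transcript} -> {protein}: {codons} codons, {residues} residues ({status})")
```

In each case the number of codons equals the number of residues, which suggests that the stated end position is the last base of the final sense codon rather than of a stop codon; that reading is an inference from the arithmetic, not a statement made in the text.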



As noted above, cluster Z44808 features 21 segment(s), which were listed in Table 368
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster Z44808_PEA_1_node_0 according to the present invention is supported
by 29 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 380 below describes
the starting and ending position of this segment on each transcript.

Table 380 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1                            669
Z44808_PEA_1_T4       1                            669
Z44808_PEA_1_T5       1                            669
Z44808_PEA_1_T8       1                            669
Z44808_PEA_1_T9       1                            669



Segment cluster Z44808_PEA_1_node_16 according to the present invention is supported
by 39 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 381 below describes
the starting and ending position of this segment on each transcript.

Table 381 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1172                         1358
Z44808_PEA_1_T4       1223                         1409
Z44808_PEA_1_T5       1223                         1409
Z44808_PEA_1_T8       1223                         1409
Z44808_PEA_1_T9       1223                         1409



Segment cluster Z44808_PEA_1_node_2 according to the present invention is supported
by 34 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 382 below describes
the starting and ending position of this segment on each transcript.

Table 382 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      670                          841
Z44808_PEA_1_T4       670                          841
Z44808_PEA_1_T5       670                          841
Z44808_PEA_1_T8       670                          841
Z44808_PEA_1_T9       670                          841



Segment cluster Z44808_PEA_1_node_24 according to the present invention is supported
by 52 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 383 below describes
the starting and ending position of this segment on each transcript.

Table 383 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1545                         1819
Z44808_PEA_1_T4       1596                         1870
Z44808_PEA_1_T5       1596                         1870
Z44808_PEA_1_T8       1596                         1870
Z44808_PEA_1_T9       1596                         1870



Segment cluster Z44808_PEA_1_node_32 according to the present invention is supported
by 17 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T4 and Z44808_PEA_1_T8. Table
384 below describes the starting and ending position of this segment on each transcript.

Table 384 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T4       1909                         3593
Z44808_PEA_1_T8       1909                         2397



Segment cluster Z44808_PEA_1_node_33 according to the present invention is supported
by 133 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4 and
Z44808_PEA_1_T5. Table 385 below describes the starting and ending position of this segment
on each transcript.

Table 385 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1858                         2734
Z44808_PEA_1_T4       3594                         4470
Z44808_PEA_1_T5       2004                         2880



Segment cluster Z44808_PEA_1_node_36 according to the present invention is supported
by 117 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4 and
Z44808_PEA_1_T5. Table 386 below describes the starting and ending position of this segment
on each transcript.

Table 386 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      2829                         3080
Z44808_PEA_1_T4       4565                         4816
Z44808_PEA_1_T5       2975                         3226



Segment cluster Z44808_PEA_1_node_37 according to the present invention is supported
by 120 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4 and
Z44808_PEA_1_T5. Table 387 below describes the starting and ending position of this segment
on each transcript.

Table 387 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      3081                         3429
Z44808_PEA_1_T4       4817                         5165
Z44808_PEA_1_T5       3227                         3575



Segment cluster Z44808_PEA_1_node_41 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T9. Table 388 below describes the
starting and ending position of this segment on each transcript.

Table 388 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T9       1974                         2206


According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster Z44808_PEA_1_node_11 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T4, Z44808_PEA_1_T5,
Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 389 below describes the starting and
ending position of this segment on each transcript.

Table 389 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T4       1097                         1147
Z44808_PEA_1_T5       1097                         1147
Z44808_PEA_1_T8       1097                         1147
Z44808_PEA_1_T9       1097                         1147









Segment cluster Z44808_PEA_1_node_13 according to the present invention is supported
by 28 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 390 below describes
the starting and ending position of this segment on each transcript.

Table 390 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1097                         1171
Z44808_PEA_1_T4       1148                         1222
Z44808_PEA_1_T5       1148                         1222
Z44808_PEA_1_T8       1148                         1222
Z44808_PEA_1_T9       1148                         1222



Segment cluster Z44808_PEA_1_node_18 according to the present invention is supported
by 27 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 391 below describes
the starting and ending position of this segment on each transcript.

Table 391 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1359                         1441
Z44808_PEA_1_T4       1410                         1492
Z44808_PEA_1_T5       1410                         1492
Z44808_PEA_1_T8       1410                         1492
Z44808_PEA_1_T9       1410                         1492


Segment cluster Z44808_PEA_1_node_22 according to the present invention is supported
by 33 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 392 below describes
the starting and ending position of this segment on each transcript.

Table 392 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1442                         1544
Z44808_PEA_1_T4       1493                         1595
Z44808_PEA_1_T5       1493                         1595
Z44808_PEA_1_T8       1493                         1595
Z44808_PEA_1_T9       1493                         1595



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (with regard to lung cancer), shown in Table 393.

Table 393 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers        Chip reference
Z44808_0_8_0            Lung squamous cell carcinoma    LUN


Segment cluster Z44808_PEA_1_node_26 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T5. Table 394 below describes the
starting and ending position of this segment on each transcript.






Table 394 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T5       1871                         1965



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (with regard to lung cancer), shown in Table 395.

Table 395 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
Z44808_0_0_72347        Lung small cell cancer      LUN



Segment cluster Z44808_PEA_1_node_30 according to the present invention is supported
by 44 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 396 below describes
the starting and ending position of this segment on each transcript.

Table 396 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1820                         1857
Z44808_PEA_1_T4       1871                         1908
Z44808_PEA_1_T5       1966                         2003
Z44808_PEA_1_T8       1871                         1908
Z44808_PEA_1_T9       1871                         1908



Segment cluster Z44808_PEA_1_node_34 according to the present invention is supported
by 70 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4 and
Z44808_PEA_1_T5. Table 397 below describes the starting and ending position of this segment
on each transcript.

Table 397 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      2735                         2809
Z44808_PEA_1_T4       4471                         4545
Z44808_PEA_1_T5       2881                         2955



Segment cluster Z44808_PEA_1_node_35 according to the present invention can be
found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4 and
Z44808_PEA_1_T5. Table 398 below describes the starting and ending position of this segment
on each transcript.

Table 398 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      2810                         2828
Z44808_PEA_1_T4       4546                         4564
Z44808_PEA_1_T5       2956                         2974



Segment cluster Z44808_PEA_1_node_39 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T9. Table 399 below describes the
starting and ending position of this segment on each transcript.

Table 399 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T9       1909                         1973




Segment cluster Z44808_PEA_1_node_4 according to the present invention is supported
by 33 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 400 below describes
the starting and ending position of this segment on each transcript.

Table 400 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      842                          948
Z44808_PEA_1_T4       842                          948
Z44808_PEA_1_T5       842                          948
Z44808_PEA_1_T8       842                          948
Z44808_PEA_1_T9       842                          948



Segment cluster Z44808_PEA_1_node_6 according to the present invention is supported
by 30 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 401 below describes
the starting and ending position of this segment on each transcript.

Table 401 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      949                          1048
Z44808_PEA_1_T4       949                          1048
Z44808_PEA_1_T5       949                          1048
Z44808_PEA_1_T8       949                          1048
Z44808_PEA_1_T9       949                          1048


Segment cluster Z44808_PEA_1_node_8 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z44808_PEA_1_T11, Z44808_PEA_1_T4,
Z44808_PEA_1_T5, Z44808_PEA_1_T8 and Z44808_PEA_1_T9. Table 402 below describes
the starting and ending position of this segment on each transcript.

Table 402 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
Z44808_PEA_1_T11      1049                         1096
Z44808_PEA_1_T4       1049                         1096
Z44808_PEA_1_T5       1049                         1096
Z44808_PEA_1_T8       1049                         1096
Z44808_PEA_1_T9       1049                         1096
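
The segment coordinates given in Tables 380 to 402 can be cross-checked with a short sketch. The example below copies the start and end positions reported for transcript Z44808_PEA_1_T11 only, computes each segment length as end minus start plus one, tests whether the segments follow one another without gaps or overlaps, and lists those of at most 120 bp. The helper names segment_length and is_contiguous are hypothetical, and the 120 bp cut-off is taken from the "up to about 120 bp" wording above as an assumption.

```python
# Segment coordinates on transcript Z44808_PEA_1_T11, copied from the
# "Segment location on transcripts" tables (1-based, inclusive positions).
segments_t11 = {
    "node_0": (1, 669), "node_2": (670, 841), "node_4": (842, 948),
    "node_6": (949, 1048), "node_8": (1049, 1096), "node_13": (1097, 1171),
    "node_16": (1172, 1358), "node_18": (1359, 1441), "node_22": (1442, 1544),
    "node_24": (1545, 1819), "node_30": (1820, 1857), "node_33": (1858, 2734),
    "node_34": (2735, 2809), "node_35": (2810, 2828), "node_36": (2829, 3080),
    "node_37": (3081, 3429),
}


def segment_length(start, end):
    """Length in bp of a segment given inclusive 1-based coordinates."""
    return end - start + 1


def is_contiguous(segments):
    """True when the segments, sorted by start, abut with no gap or overlap."""
    ordered = sorted(segments.values())
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


lengths = {name: segment_length(s, e) for name, (s, e) in segments_t11.items()}
short_segments = {name for name, n in lengths.items() if n <= 120}

print("contiguous:", is_contiguous(segments_t11))
print("total length covered:", sum(lengths.values()))
print("segments of at most 120 bp:", sorted(short_segments))
```

With these values the listed segments cover positions 1 to 3429 of the transcript without gaps, and the segments of at most 120 bp are the ones described in the separate short-segment description.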






Variant protein alignment to the previously known protein:

Sequence name: /tmp/vUqLu6eAVZ/K3JDuPvaLo:SM02_HUMAN

Sequence documentation:

Alignment of: Z44808_PEA_1_P5 x SM02_HUMAN

Alignment segment 1/1:

Quality: 4440.00
Escore: 0
Matching length: 441
Total length: 441
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50

 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100

101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150

151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200

201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250

251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300

301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350

351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400

401 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ 441
    |||||||||||||||||||||||||||||||||||||||||
401 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ 441


Sequence name: /tmp/QSUNfTsJ5y/kLOw5Vb6SD:SM02_HUMAN

Sequence documentation:

Alignment of: Z44808_PEA_1_P6 x SM02_HUMAN

Alignment segment 1/1:

Quality: 4310.00
Escore: 0
Matching length: 428
Total length: 428
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50

 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100

101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150

151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200

201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250

251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300

301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350

351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400

401 DKSISVQELMGCLGVAKEDGKADTKKRH 428
    ||||||||||||||||||||||||||||
401 DKSISVQELMGCLGVAKEDGKADTKKRH 428


Sequence name: /tmp/MZVdR4PVdM/5uN8RwViJl:SM02_HUMAN

Sequence documentation:

Alignment of: Z44808_PEA_1_P7 x SM02_HUMAN

Alignment segment 1/1:

Quality: 4440.00
Escore: 0
Matching length: 441
Total length: 441
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50

 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100

101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150

151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200

201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250

251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300

301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350

351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400

401 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ 441
    |||||||||||||||||||||||||||||||||||||||||
401 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQ 441


Sequence name: /tmp/3fGVxqLloe/J5mQduAdOF:SM02_HUMAN

Sequence documentation:

Alignment of: Z44808_PEA_1_P11 x SM02_HUMAN

Alignment segment 1/1:

Quality: 4228.00
Escore: 0
Matching length: 429
Total length: 446
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 96.19
Total Percent Identity: 96.19
Gaps: 1

Alignment:

  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQK 50

 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PLCASDGRTFLSRCEFQRAKCKDPQLEIAYRGNCKDVSRCVAERKYTQEQ 100

101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKT 150

151 PRCPGSVNEKLPQREGTGKT.................DIASRYPTLWTEQ 183
    ||||||||||||||||||||                 |||||||||||||
151 PRCPGSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQ 200

184 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 233
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 VKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKPVQ 250

234 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 283
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 CHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQ 300

284 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 333
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 LQGCPGAKKHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEER 350

334 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 383
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 VVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNN 400

384 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQPRKQG 429
    ||||||||||||||||||||||||||||||||||||||||||||||
401 DKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQPRKQG 446
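
The summary figures reported for the Z44808_PEA_1_P11 alignment can be reproduced with a small worked calculation. The sketch below assumes, without confirmation from the application, that "Total Percent Identity" is the number of identically matched residues divided by the total alignment length, and that the single gap accounts for the difference between the two lengths; the function name percent_identity is hypothetical.

```python
def percent_identity(identical_residues, alignment_length):
    """Identity as a percentage of the alignment length, to two decimals.

    Assumption: this is how the reported "Total Percent Identity" is defined.
    """
    return round(100.0 * identical_residues / alignment_length, 2)


# Values reported for the Z44808_PEA_1_P11 x SM02_HUMAN alignment.
matching_length = 429   # residues of the variant aligned to the known protein
total_length = 446      # alignment length including the single gap

total_identity = percent_identity(matching_length, total_length)
gap_length = total_length - matching_length

print("total percent identity:", total_identity)
print("residues spanned by the gap:", gap_length)
```

Under these assumptions the calculation gives 96.19, the reported value, and a gap of 17 residues, which corresponds to residues 171 to 187 of the known protein being absent from the variant.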




Expression of SM02_HUMAN SPARC related modular calcium-binding protein 2 precursor
Z44808 transcripts which are detectable by amplicon as depicted in sequence name
Z44808junc8-11 in normal and cancerous lung tissues

Expression of SM02_HUMAN SPARC related modular calcium-binding protein 2 precursor
(Secreted modular calcium-binding protein 2) (SMOC-2) (Smooth muscle-associated protein 2)
transcripts detectable by or according to junc8-11, Z44808junc8-11 amplicon (SEQ ID NO:
1651) and Z44808junc8-11F (SEQ ID NO: 1649) and Z44808junc8-11R (SEQ ID NO: 1650)
primers was measured by real time PCR. In parallel the expression of four housekeeping genes
- PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334),
HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID
NO:1297), Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon,
SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon,
SEQ ID NO:331) was measured similarly. For each RT sample, the expression of the
above amplicon was normalized to the geometric mean of the quantities of the housekeeping
genes. The normalized quantity of each RT sample was then divided by the median of the
quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2,
"Tissue samples in testing panel", above), to obtain a value of fold up-regulation for each
sample relative to the median of the normal PM samples.
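
The normalization just described can be sketched as follows. This is an illustrative outline only: the sample names and all quantities are invented placeholder values, not data from the application, and the function names normalize and fold_up_regulation are hypothetical.

```python
from statistics import geometric_mean, median


def normalize(amplicon_qty, housekeeping_qtys):
    """Divide the amplicon quantity by the geometric mean of the
    housekeeping-gene quantities measured in the same RT sample."""
    return amplicon_qty / geometric_mean(housekeeping_qtys)


def fold_up_regulation(normalized, normal_ids):
    """Divide every normalized quantity by the median of the normalized
    quantities of the normal (reference) samples."""
    reference = median([normalized[i] for i in normal_ids])
    return {sample: qty / reference for sample, qty in normalized.items()}


# Placeholder measurements: (amplicon quantity, four housekeeping quantities).
raw = {
    "normal_1": (2.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_2": (6.0, [2.0, 2.0, 2.0, 2.0]),
    "normal_3": (4.0, [2.0, 2.0, 2.0, 2.0]),
    "tumor_1": (20.0, [2.0, 2.0, 2.0, 2.0]),
}

normalized = {s: normalize(q, hk) for s, (q, hk) in raw.items()}
folds = fold_up_regulation(normalized, ["normal_1", "normal_2", "normal_3"])

for sample, fold in folds.items():
    print(f"{sample}: {fold:.2f}-fold relative to the normal median")
```

The geometric mean is used for the housekeeping genes so that no single highly expressed reference gene dominates the normalization factor; the median of the normal samples is likewise less sensitive to an outlying normal sample than the mean would be.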

Figure 29 is a histogram showing over-expression of the above-indicated
SM02_HUMAN SPARC related modular calcium-binding protein 2 precursor transcripts in
cancerous lung samples relative to the normal samples.

As is evident from Figure 29, the expression of SM02_HUMAN SPARC related
modular calcium-binding protein 2 precursor transcripts detectable by the above amplicon in
several cancer samples was significantly higher than in the non-cancerous samples (Sample
Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel"). Notably, an
over-expression of at least 5 fold was found in 2 out of 15 adenocarcinoma samples and in 3 out of 8
small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z44808junc8-11F forward primer;
and Z44808junc8-11R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Z44808junc8-11.

Forward primer (SEQ ID NO: 1649): GAAGGCACAGGAAAAACAGATATTG
Reverse primer (SEQ ID NO: 1650): TGGTGCTCTTGGTCACAGGAT
Amplicon (SEQ ID NO: 1651):
GAAGGCACAGGAAAAACAGATATTGCATCACGTTACCCTACCCTTTGGACTGAACA
GGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTGTGACCAAG
AGCACCA
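
As a consistency check on the sequences listed above, the following sketch tests whether the amplicon begins with the forward primer and ends with the reverse complement of the reverse primer, as would be expected for a PCR product bounded by this primer pair. The sequences are copied from the text; the helper name reverse_complement is hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


forward_primer = "GAAGGCACAGGAAAAACAGATATTG"   # SEQ ID NO: 1649
reverse_primer = "TGGTGCTCTTGGTCACAGGAT"       # SEQ ID NO: 1650
amplicon = (                                   # SEQ ID NO: 1651
    "GAAGGCACAGGAAAAACAGATATTGCATCACGTTACCCTACCCTTTGGACTGAACA"
    "GGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTGTGACCAAG"
    "AGCACCA"
)

starts_with_forward = amplicon.startswith(forward_primer)
ends_with_reverse = amplicon.endswith(reverse_complement(reverse_primer))

print("amplicon starts with forward primer:", starts_with_forward)
print("amplicon ends with reverse-complemented reverse primer:", ends_with_reverse)
```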



Expression of SM02_HUMAN SPARC related modular calcium-binding protein 2 precursor
(Secreted modular calcium-binding protein 2) (SMOC-2) (Smooth muscle-associated protein 2)
Z44808 transcripts which are detectable by amplicon as depicted in sequence name
Z44808junc8-11 in different normal tissues

Expression of SM02_HUMAN SPARC related modular calcium-binding protein 2
precursor (Secreted modular calcium-binding protein 2) (SMOC-2) (Smooth muscle-associated
protein 2) transcripts detectable by or according to Z44808junc8-11 amplicon (SEQ ID NO:
1651) and primers: Z44808junc8-11F (SEQ ID NO: 1649) and Z44808junc8-11R (SEQ ID NO:
1650) was measured by real time PCR. In parallel the expression of four housekeeping genes -
RPL19 (GenBank Accession No. NM_000981; RPL19 amplicon, SEQ ID NO:1630), TATA
box (GenBank Accession No. NM_003194; TATA amplicon, SEQ ID NO:1633), Ubiquitin
(GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and
SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331)
was measured similarly. For each RT sample, the expression of the above amplicon was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the ovary
samples (Sample Nos. 18-20, Table 3), to obtain a value of relative expression of each sample
relative to the median of the ovary samples.

Primers:

Forward primer (SEQ ID NO: 1649): GAAGGCACAGGAAAAACAGATATTG
Reverse primer (SEQ ID NO: 1650): TGGTGCTCTTGGTCACAGGAT
Amplicon (SEQ ID NO: 1651):
GAAGGCACAGGAAAAACAGATATTGCATCACGTTACCCTACCCTTTGGACTGAACA
GGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTGTGACCAAG
AGCACCA

The results are demonstrated in Figure 18, showing the expression of SM02_HUMAN
SPARC related modular calcium-binding protein 2 precursor (Secreted modular
calcium-binding protein 2) (SMOC-2) (Smooth muscle-associated protein 2) Z44808 transcripts which
are detectable by amplicon as depicted in sequence name Z44808junc8-11 in different normal
tissues.



DESCRIPTION FOR CLUSTER AA161187

Cluster AA161187 features 7 transcript(s) and 20 segment(s) of interest, the names for
which are given in Tables 403 and 404, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 405.



Table 403 - Transcripts of interest

Transcript Name    Sequence ID No.
AA161187_T0        41
AA161187_T7        42
AA161187_T15       43
AA161187_T16       44
AA161187_T20       45
AA161187_T21       46
AA161187_T22       47



Table 404 - Segments of interest

Segment Name        Sequence ID No.
AA161187_node_0     482
AA161187_node_6     483
AA161187_node_14    484
AA161187_node_16    485
AA161187_node_25    486
AA161187_node_26    487
AA161187_node_28    488
AA161187_node_4     489
AA161187_node_7     490
AA161187_node_8     491
AA161187_node_9     492
AA161187_node_10    493
AA161187_node_12    494
AA161187_node_13    495
AA161187_node_19    496
AA161187_node_20    497
AA161187_node_21    498
AA161187_node_22    499
AA161187_node_23    500
AA161187_node_24    501



Table 405 - Proteins of interest

Protein Name    Sequence ID No.   Corresponding Transcript(s)
AA161187_P1     1318              AA161187_T0
AA161187_P6     1319              AA161187_T7
AA161187_P13    1320              AA161187_T15
AA161187_P14    1321              AA161187_T16
AA161187_P18    1322              AA161187_T20
AA161187_P19    1323              AA161187_T21



These sequences are variants of the known protein Testisin precursor (SwissProt
accession identifier TEST_HUMAN; known also according to the synonyms EC 3.4.21.-;
Eosinophil serine protease 1; ESP-1; UNQ266/PRO303), SEQ ID NO: 1431, referred to herein
as the previously known protein.

Protein Testisin precursor is known or believed to have the following function(s): Could
regulate proteolytic events associated with testicular germ cell maturation. The sequence for
protein Testisin precursor is given at the end of the application, as "Testisin precursor amino
acid sequence". Protein Testisin precursor is believed to be attached to the
membrane by a GPI-anchor.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: serine-type peptidase, which are annotation(s) related to Molecular
Function; and membrane fraction; cytoplasm; plasma membrane, which are annotation(s)
related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster AA161187 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the left hand column of
the table and the numbers on the y-axis of Figure 30 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
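The "parts per million" figure described above is a scaled ratio. A minimal sketch follows, assuming the EST counts passed in have already been weighted; the weighting scheme is not stated in this passage, and the function name and example counts are hypothetical.

```python
def parts_per_million(cluster_est_count, category_est_count):
    """Expression of a cluster in one tissue category, expressed as the
    number of cluster ESTs per million ESTs in that category."""
    if category_est_count <= 0:
        raise ValueError("category_est_count must be positive")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical counts: 2 cluster ESTs among 500,000 ESTs in a category.
print(parts_per_million(2, 500_000))  # prints 4.0
```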

Overall, the following results were obtained as shown with regard to the histograms in
Figure 30 and Table 406. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: brain malignant tumors, epithelial malignant tumors and a
mixture of malignant tumors from different tissues.



Table 406 - Normal tissue distribution

Name of Tissue   Number
bone             0
brain            1
colon            0
epithelial       0
general          0
lung             0
breast           0
bone marrow      0
ovary            0
pancreas         0
prostate         4
stomach          0
uterus           0



Table 407 - P values and ratios for expression in cancerous tissue

Name of Tissue  P1        P2        SP1       R3        SP2       R4
bone            1         6.7e-01             1.0       3.4e-01   1.9
brain           9.8e-01   6.0e-01             0.7       3.8e-03   3.6
colon           4.4e-01   5.0e-01   7.0e-01   1.5       7.7e-01   1.3
epithelial      1.3e-02   2.6e-03   1.7e-03   8.4       2.4e-04   7.9
general         1.6e-03   1.9e-05   1.9e-05   12.1      2.9e-10   15.6
lung            5.0e-01   6.3e-01   1.7e-01   3.9       3.8e-01   2.2
breast          1         6.7e-01             1.0       8.2e-01   1.2
bone marrow     1         4.2e-01             1.0       1.5e-01   2.9
ovary           6.2e-01   6.5e-01   4.7e-01   1.9       5.9e-01   1.6
pancreas        1         4.4e-01             1.0       2.8e-01   2.8
prostate        5.9e-01   5.9e-01   1.4e-01   2.9       2.4e-01   2.3
stomach         1         4.7e-01             1.0       6.4e-01   1.5
uterus          1         2.4e-01             1.0       1.7e-01   2.0



As noted above, cluster AA161187 features 7 transcript(s), which were listed in Table 403
above. These transcript(s) encode for protein(s) which are variant(s) of protein Testisin
precursor. A description of each variant protein according to the present invention is now
provided.



Variant protein AA161187_P1 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T0.
The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide.

Variant protein AA161187_P1 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 408 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein AA161187_P1 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 408 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
1                                        M ->                        No
16                                       A ->                        No
226                                      N ->                        No
253                                      I -> V                      No
255                                      V -> I                      No
264                                      R ->                        No
264                                      R -> P                      No
264                                      R -> Q                      Yes



Variant protein AA161187_P1 is encoded by the following transcript(s): AA161187_T0,
for which the sequence(s) is/are given at the end of the application. The coding portion of
transcript AA161187_T0 is shown in bold; this coding portion starts at position 107 and ends at
position 1048. The transcript also has the following SNPs as listed in Table 409 (given
according to their position on the nucleotide sequence, with the alternative nucleic acid listed;
the last column indicates whether the SNP is known or not; the presence of known SNPs in
variant protein AA161187_P1 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 409 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
66                                    T -> A                     No
67                                    T -> G                     No
105                                   C -> T                     No
108                                   T ->                       No
154                                   T ->                       No
190                                   C -> G                     No
469                                   A -> G                     Yes
571                                   C -> T                     Yes
782                                   A ->                       No
859                                   T -> C                     Yes
863                                   A -> G                     No
869                                   G -> A                     No
897                                   G ->                       No
897                                   G -> A                     Yes
897                                   G -> C                     No
1000                                  A -> G                     Yes
1068                                  G ->                       No
1068                                  G -> A                     No
1069                                  C -> A                     No
1168                                  A -> G                     Yes
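
The nucleotide positions in Table 409 can be related to the amino acid positions in Table 408 through the stated coding portion of transcript AA161187_T0 (positions 107 to 1048). The sketch below assumes that the reading frame begins at the first nucleotide of the coding portion and that positions are 1-based; the function name is illustrative and not part of the application.

```python
def codon_location(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to (amino acid number, position
    within the codon). Returns None outside the coding portion."""
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START, CDS_END = 107, 1048  # coding portion of AA161187_T0

for nt_pos in (66, 108, 154, 782, 863, 869, 897, 1068):
    print(nt_pos, codon_location(nt_pos, CDS_START, CDS_END))
```

Under these assumptions, positions 108, 154, 782, 863, 869 and 897 fall in codons 1, 16, 226, 253, 255 and 264, which are the amino acid positions listed in Table 408, while positions 66 and 1068 lie outside the coding portion.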



Variant protein AA161187_P6 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T7. An
alignment is given to the known protein (Testisin precursor) at the end of the application. One or
more alignments to one or more previously published protein sequences are given at the end of
the application. A brief description of the relationship of the variant protein according to the
present invention to each such aligned protein is as follows:

Comparison report between AA161187_P6 and TEST_HUMAN:

1. An isolated chimeric polypeptide encoding for AA161187_P6, comprising a first amino
acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR
corresponding to amino acids 1 - 42 of AA161187_P6, and a second amino acid sequence being
at least 90 % homologous to
GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYS
DLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPV
TYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNH
LFLKYSFRKDIFGDMVCAGNAQGGKDACFGDSGGPLACNKNGLWYQIGVVSWGVGC
GRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV
corresponding to amino acids 31 - 314 of TEST_HUMAN, which also corresponds to amino
acids 43 - 326 of AA161187_P6, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of AA161187_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR of
AA161187_P6.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
membrane. The protein localization is believed to be membrane because, although it is a partial
protein, both trans-membrane region prediction programs predict that this protein has a
trans-membrane region.

WO 2006/131783    PCT/IB2005/004037

Variant protein AA161187_P6 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 410 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein AA161187_P6 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 410 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
238                                      N ->                        No
265                                      I -> V                      No
267                                      V -> I                      No
276                                      R ->                        No
276                                      R -> P                      No
276                                      R -> Q                      Yes



The glycosylation sites of variant protein AA161187_P6, as compared to the known
protein Testisin precursor, are described in Table 411 (given according to their position(s) on
the amino acid sequence in the first column; the second column indicates whether the
glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).



Table 411 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
200                                        yes                           212
167                                        yes                           179
273                                        yes                           285



Variant protein AA161187_P6 is encoded by the following transcript(s): AA161187_T7,
for which the sequence(s) is/are given at the end of the application. The coding portion of
transcript AA161187_T7 is shown in bold; this coding portion starts at position 1 and ends at
position 979. The transcript also has the following SNPs as listed in Table 412 (given according
to their position on the nucleotide sequence, with the alternative nucleic acid listed; the last
column indicates whether the SNP is known or not; the presence of known SNPs in variant
protein AA161187_P6 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 412 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
400                                   A -> G                     Yes
502                                   C -> T                     Yes
713                                   A ->                       No
790                                   T -> C                     Yes
794                                   A -> G                     No
800                                   G -> A                     No
828                                   G ->                       No
828                                   G -> A                     Yes
828                                   G -> C                     No
931                                   A -> G                     Yes
999                                   G ->                       No
999                                   G -> A                     No
1000                                  C -> A                     No
1099                                  A -> G                     Yes



Variant protein AA161187_P13 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T15.
An alignment is given to the known protein (Testisin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between AA161187_P13 and TEST_HUMAN:

1. An isolated chimeric polypeptide encoding for AA161187_P13, comprising a first
amino acid sequence being at least 90 % homologous to
MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1 - 183 of AA161187_P13, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
GSSGRHHKQLYVQPPLPQVQFPQGHLWRHG corresponding to amino acids 184 - 213 of
AA161187_P13, wherein said first amino acid sequence and second amino acid sequence are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of AA161187_P13, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GSSGRHHKQLYVQPPLPQVQFPQGHLWRHG in AA161187_P13.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein AA161187_P13 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 413 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein AA161187_P13
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 413 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
1                                        M ->                        No
16                                       A ->                        No



The glycosylation sites of variant protein AA161187_P13, as compared to the known
protein Testisin precursor, are described in Table 414 (given according to their position(s) on
the amino acid sequence in the first column; the second column indicates whether the
glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).



Table 414 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
200                                        no
167                                        yes                           167
273                                        no





Variant protein AA161187_P13 is encoded by the following transcript(s):
AA161187_T15, for which the sequence(s) is/are given at the end of the application. The coding
portion of transcript AA161187_T15 is shown in bold; this coding portion starts at position 107
and ends at position 745. The transcript also has the following SNPs as listed in Table 415
(given according to their position on the nucleotide sequence, with the alternative nucleic acid
listed; the last column indicates whether the SNP is known or not; the presence of known SNPs
in variant protein AA161187_P13 sequence provides support for the deduced sequence of this
variant protein according to the present invention).



Table 415 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
66                                    T -> A                     No
67                                    T -> G                     No
105                                   C -> T                     No
108                                   T ->                       No
154                                   T ->                       No
190                                   C -> G                     No
469                                   A -> G                     Yes
571                                   C -> T                     Yes
791                                   T -> C                     Yes
795                                   A -> G                     No
801                                   G -> A                     No
829                                   G ->                       No
829                                   G -> A                     Yes
829                                   G -> C                     No
932                                   A -> G                     Yes
1000                                  G ->                       No
1000                                  G -> A                     No
1001                                  C -> A                     No
1100                                  A -> G                     Yes
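
The stated coding portion of transcript AA161187_T15 (positions 107 to 745) can be cross-checked against the length of variant protein AA161187_P13 given in the comparison report (183 residues shared with TEST_HUMAN followed by a 30-residue tail). A minimal sketch, assuming 1-based inclusive positions and a coding portion that does not include a stop codon; the function name is illustrative only.

```python
def codon_count(cds_start, cds_end):
    """Return (whole codons, leftover nucleotides) for a 1-based,
    inclusive coding span."""
    return divmod(cds_end - cds_start + 1, 3)


SHARED_WITH_TEST_HUMAN = 183                     # residues 1 - 183
UNIQUE_TAIL = "GSSGRHHKQLYVQPPLPQVQFPQGHLWRHG"   # residues 184 - 213

codons, leftover = codon_count(107, 745)
expected_length = SHARED_WITH_TEST_HUMAN + len(UNIQUE_TAIL)
print(codons, leftover, expected_length)  # prints 213 0 213
```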



Variant protein AA161187_P14 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T16.
An alignment is given to the known protein (Testisin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between AA161187_P14 and TEST_HUMAN:

1. An isolated chimeric polypeptide encoding for AA161187_P14, comprising a first
amino acid sequence being at least 90 % homologous to
MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1 - 183 of AA161187_P14, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
GCCLSPSHYRPHSTAISPHPPGSSGRHHKQLYVQPPLPQVQFPQGHLWRHGLCWQCPRR
EGCLLRECPCHHSQPRKASCVPVPYLTLMPTPGGGDCCPTLQMQKRRLGCCQGEEEDV
HPVYPAP corresponding to amino acids 184 - 307 of AA161187_P14, wherein said first amino
acid sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of AA161187_P14, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
GCCLSPSHYRPHSTAISPHPPGSSGRHHKQLYVQPPLPQVQFPQGHLWRHGLCWQCPRR
EGCLLRECPCHHSQPRKASCVPVPYLTLMPTPGGGDCCPTLQMQKRRLGCCQGEEEDV
HPVYPAP in AA161187_P14.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein AA161187_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 416 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein AA161187_P14
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 416 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
1                                        M ->                        No
16                                       A ->                        No
238                                      Q ->                        No



The glycosylation sites of variant protein AA161187_P14, as compared to the known
protein Testisin precursor, are described in Table 417 (given according to their position(s) on
the amino acid sequence in the first column; the second column indicates whether the
glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).



Table 417 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
200                                        no
167                                        yes                           167
273                                        no





Variant protein AA161187_P14 is encoded by the following transcript(s):
AA161187_T16, for which the sequence(s) is/are given at the end of the application. The coding
portion of transcript AA161187_T16 is shown in bold; this coding portion starts at position 107
and ends at position 1027. The transcript also has the following SNPs as listed in Table 418
(given according to their position on the nucleotide sequence, with the alternative nucleic acid
listed; the last column indicates whether the SNP is known or not; the presence of known SNPs
in variant protein AA161187_P14 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 418 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
66                                    T -> A                     No
67                                    T -> G                     No
105                                   C -> T                     No
108                                   T ->                       No
154                                   T ->                       No
190                                   C -> G                     No
469                                   A -> G                     Yes
571                                   C -> T                     Yes
819                                   A ->                       No
859                                   C -> T                     Yes
1152                                  T -> C                     Yes
1156                                  A -> G                     No
1162                                  G -> A                     No
1190                                  G ->                       No
1190                                  G -> A                     Yes
1190                                  G -> C                     No
1293                                  A -> G                     Yes
1361                                  G ->                       No
1361                                  G -> A                     No
1362                                  C -> A                     No
1461                                  A -> G                     Yes



Variant protein AA161187_P18 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T20.
An alignment is given to the known protein (Testisin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between AA161187_P18 and TEST_HUMAN:

1. An isolated chimeric polypeptide encoding for AA161187_P18, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR
corresponding to amino acids 1 - 42 of AA161187_P18, a second amino acid sequence being at
least 90 % homologous to
GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFET
corresponding to amino acids 31 - 86 of TEST_HUMAN, which also corresponds to amino
acids 43 - 98 of AA161187_P18, a third amino acid sequence being at least 90 % homologous to
DLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPV
TYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNH
LFLKYSFRKDIFGDMVCAGNAQGGKDACF corresponding to amino acids 89 - 235 of
TEST_HUMAN, which also corresponds to amino acids 99 - 245 of AA161187_P18, and a
fourth amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence VSVPATTPSPGKHPVSLCLI corresponding to amino acids 246 - 265 of
AA161187_P18, wherein said first amino acid sequence, second amino acid sequence, third
amino acid sequence and fourth amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of AA161187_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR of
AA161187_P18.

3. An isolated chimeric polypeptide encoding for an edge portion of AA161187_P18,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise TD, having a
structure as follows: a sequence starting from any of amino acid numbers 98-x to 98; and ending
at any of amino acid numbers 99 + ((n-2) - x), in which x varies from 0 to n-2.

4. An isolated polypeptide encoding for a tail of AA161187_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VSVPATTPSPGKHPVSLCLI in AA161187_P18.
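
The edge-portion definition in item 3 above describes windows of length n that span the junction between amino acids 98 and 99 of the variant. The sketch below enumerates those windows and confirms, from the first two sequences quoted in item 1, that residues 98 and 99 are T and D. It is an illustration only; the function name is not part of the application.

```python
HEAD = "HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIR"  # residues 1 - 42
SEGMENT_2 = (                                         # residues 43 - 98
    "GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFET"
)
SEGMENT_3_START = "DLSDPSG"                           # residues 99 onward


def edge_windows(n, left=98):
    """List the (start, end) residue ranges of length n that contain both
    residue `left` and residue `left + 1`, for x = 0 .. n-2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [(left - x, left + 1 + (n - 2) - x) for x in range(n - 1)]


junction = (HEAD + SEGMENT_2)[-1] + SEGMENT_3_START[0]
print(len(HEAD + SEGMENT_2), junction)  # prints 98 TD
print(edge_windows(10))
```

For n = 10 this gives nine windows, from residues 98 - 107 (x = 0) to residues 90 - 99 (x = 8).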

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
membrane. The protein localization is believed to be membrane because, although it is a partial
protein, both trans-membrane region prediction programs predict that this protein has a
trans-membrane region.

Variant protein AA161187_P18 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 419 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein AA161187_P18
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 419 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
236                                      N ->                        No
249                                      P -> L                      Yes



The glycosylation sites of variant protein AA161187_P18, as compared to the known
protein Testisin precursor, are described in Table 420 (given according to their position(s) on
the amino acid sequence in the first column; the second column indicates whether the
glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).

Table 420 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
200                                        yes                           210
167                                        yes                           177
273                                        no
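
The variant positions in the glycosylation tables follow from the aligned segments given in the comparison reports. The sketch below maps a position on TEST_HUMAN to the corresponding position on a variant, using the segment boundaries stated for AA161187_P6 and AA161187_P18. It is an illustration only: the function and constant names are hypothetical, and it assumes a site is carried over whenever its position lies inside an aligned segment, which does not by itself establish that the site is functional.

```python
def map_position(known_pos, segments):
    """Map a 1-based position on the known protein to the variant.

    `segments` holds (known_start, known_end, variant_start) tuples for
    each aligned block. Returns None if the position is not covered.
    """
    for known_start, known_end, variant_start in segments:
        if known_start <= known_pos <= known_end:
            return variant_start + (known_pos - known_start)
    return None


# Aligned blocks taken from the comparison reports in the text.
P6_SEGMENTS = [(31, 314, 43)]
P18_SEGMENTS = [(31, 86, 43), (89, 235, 99)]

for site in (200, 167, 273):
    print(site, map_position(site, P6_SEGMENTS), map_position(site, P18_SEGMENTS))
```

Under these assumptions, sites 200, 167 and 273 map to 212, 179 and 285 on AA161187_P6, and to 210, 177 and no position on AA161187_P18, in agreement with Tables 411 and 420.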





Variant protein AA161187_P18 is encoded by the following transcript(s):
AA161187_T20, for which the sequence(s) is/are given at the end of the application. The coding
portion of transcript AA161187_T20 is shown in bold; this coding portion starts at position 1
and ends at position 796. The transcript also has the following SNPs as listed in Table 421
(given according to their position on the nucleotide sequence, with the alternative nucleic acid
listed; the last column indicates whether the SNP is known or not; the presence of known SNPs
in variant protein AA161187_P18 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 421 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
394                                   A -> G                     Yes
496                                   C -> T                     Yes
707                                   A ->                       No
747                                   C -> T                     Yes
1040                                  T -> C                     Yes
1044                                  A -> G                     No
1050                                  G -> A                     No
1078                                  G ->                       No
1078                                  G -> A                     Yes
1078                                  G -> C                     No
1181                                  A -> G                     Yes
1249                                  G ->                       No
1249                                  G -> A                     No
1250                                  C -> A                     No
1349                                  A -> G                     Yes



Variant protein AA161187_P19 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AA161187_T21.
An alignment is given to the known protein (Testisin precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between AA161187_P19 and TEST_HUMAN:

1. An isolated chimeric polypeptide encoding for AA161187_P19, comprising a first
amino acid sequence being at least 90 % homologous to
MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS
LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY
YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG
WGYIKEDE corresponding to amino acids 1 - 183 of TEST_HUMAN, which also corresponds
to amino acids 1 - 183 of AA161187_P19, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence DKRTQ
corresponding to amino acids 184 - 188 of AA161187_P19, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of AA161187_P19, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DKRTQ in AA161187_P19.



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein AA161187_P19 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 422, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein AA161187_P19
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 422 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
1 | M-> | No
16 | A-> | No



The glycosylation sites of variant protein AA161187_P19, as compared to the known
protein Testisin precursor, are described in Table 423 (given according to their position(s) on 
the amino acid sequence in the first column; the second column indicates whether the 
glycosylation site is present in the variant protein; and the last column indicates whether the 
position is different on the variant protein). 



Table 423 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein?
200 | no |
167 | yes | 167
273 | no |
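
As a hedged cross-check of Table 423, the variant sequence can be scanned for N-glycosylation sequons. The N-X-S/T rule (X not proline) used below is a common heuristic and is an assumption here; the application does not state how the sites were predicted. VARIANT and find_sequons are hypothetical names, and the sequence is the 188-residue AA161187_P19 sequence given in the comparison report.

```python
import re

# AA161187_P19: residues 1-183 shared with TEST_HUMAN plus the tail DKRTQ.
VARIANT = (
    "MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGS"
    "LRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAY"
    "YTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTG"
    "WGYIKEDE"
    "DKRTQ"
)


def find_sequons(seq):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro).

    A lookahead is used so that overlapping motifs are all reported."""
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)]


print(find_sequons(VARIANT))
```

With this heuristic the only site found is position 167, which matches the single site marked as present in the variant. Positions 200 and 273 of the known protein lie beyond residue 188 and so cannot occur in this truncated variant.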





Variant protein AA161187_P19 is encoded by the following transcript(s):
AA161187_T21, for which the sequence(s) is/are given at the end of the application. The coding
portion of transcript AA161187_T21 is shown in bold; this coding portion starts at position 107
and ends at position 670. The transcript also has the following SNPs as listed in Table 424
(given according to their position on the nucleotide sequence, with the alternative nucleic acid
listed; the last column indicates whether the SNP is known or not; the presence of known SNPs
in variant protein AA161187_P19 sequence provides support for the deduced sequence of this
variant protein according to the present invention).
Table 424 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
66 | T->A | No
67 | T->G | No
105 | C->T | No
108 | T-> | No
154 | T-> | No
190 | C->G | No
469 | A->G | Yes
571 | C->T | Yes
719 | G->T | Yes
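
The relationship between the nucleotide SNP positions of Table 424 and the amino acid positions of Table 422 can be illustrated with a small sketch. It assumes 1-based, inclusive transcript coordinates and the coding portion 107 - 670 stated above for transcript AA161187_T21; the function name codon_index is hypothetical.

```python
CDS_START, CDS_END = 107, 670  # coding portion of AA161187_T21, as stated in the text


def codon_index(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Map a 1-based transcript position to a 1-based codon (amino acid) number.

    Returns None for positions outside the coding portion (untranslated regions)."""
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


snp_positions = [66, 67, 105, 108, 154, 190, 469, 571, 719]
mapping = {pos: codon_index(pos) for pos in snp_positions}

# Number of complete codons in the coding portion.
cds_codons = (CDS_END - CDS_START + 1) // 3

print(mapping)
print(cds_codons)
```

Under these assumptions, nucleotide positions 108 and 154 fall in codons 1 and 16, the two positions listed in Table 422, while positions 66, 67, 105 and 719 lie outside the coding portion. The coding portion spans 564 nucleotides, that is 188 codons, which agrees with the 188 residues of AA161187_P19 if the stated coordinates exclude the stop codon.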



As noted above, cluster AA161187 features 20 segment(s), which were listed in Table
404 above and for which the sequence(s) are given at the end of the application. These
segment(s) are portions of nucleic acid sequence(s) which are described herein separately
because they are of particular interest. A description of each segment according to the present
invention is now provided.

Segment cluster AA161187_node_0 according to the present invention is supported by 21
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T0, AA161187_T15, AA161187_T16,
AA161187_T21 and AA161187_T22. Table 425 below describes the starting and ending
position of this segment on each transcript.

Table 425 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 1 | 170
AA161187_T15 | 1 | 170
AA161187_T16 | 1 | 170
AA161187_T21 | 1 | 170
AA161187_T22 | 1 | 170



Segment cluster AA161187_node_6 according to the present invention is supported by 3
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T7 and AA161187_T20. Table 426 below
describes the starting and ending position of this segment on each transcript.

Table 426 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T7 | 1 | 120
AA161187_T20 | 1 | 120









Segment cluster AA161187_node_14 according to the present invention is supported by
35 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16, AA161187_T20, AA161187_T21 and AA161187_T22. Table 427 below
describes the starting and ending position of this segment on each transcript.

Table 427 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 446 | 656
AA161187_T7 | 377 | 587
AA161187_T15 | 446 | 656
AA161187_T16 | 446 | 656
AA161187_T20 | 371 | 581
AA161187_T21 | 446 | 656
AA161187_T22 | 446 | 656
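
Because a segment is the same stretch of nucleic acid sequence wherever it occurs, its length should be identical on every transcript even when its coordinates differ. The following sketch, which assumes 1-based inclusive coordinates and uses the hypothetical name NODE_14, checks this for the rows of Table 427.

```python
# Start and end positions of segment AA161187_node_14, copied from Table 427.
NODE_14 = {
    "AA161187_T0": (446, 656),
    "AA161187_T7": (377, 587),
    "AA161187_T15": (446, 656),
    "AA161187_T16": (446, 656),
    "AA161187_T20": (371, 581),
    "AA161187_T21": (446, 656),
    "AA161187_T22": (446, 656),
}

# With inclusive coordinates, length = end - start + 1.
lengths = {name: end - start + 1 for name, (start, end) in NODE_14.items()}
consistent = len(set(lengths.values())) == 1

print(lengths)
print(consistent)
```

Every row gives a length of 211 nucleotides, so the differing start positions reflect only the different amounts of upstream sequence in each transcript.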



Segment cluster AA161187_node_16 according to the present invention is supported by 2
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T22. Table 428 below describes the starting
and ending position of this segment on each transcript.

Table 428 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T22 | 657 | 953



Segment cluster AA161187_node_25 according to the present invention is supported by
13 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T16 and AA161187_T20. Table 429 below
describes the starting and ending position of this segment on each transcript.

Table 429 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T16 | 880 | 1104
AA161187_T20 | 768 | 992









Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 430.

Table 430 - Oligonucleotides related to this segment

Oligonucleotide name | Overexpressed in cancers | Chip reference
AA161187_0_0_430 | lung malignant tumors | LUN



Segment cluster AA161187_node_26 according to the present invention is supported by
39 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16 and AA161187_T20. Table 431 below describes the starting and ending
position of this segment on each transcript.

Table 431 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 812 | 1173
AA161187_T7 | 743 | 1104
AA161187_T15 | 744 | 1105
AA161187_T16 | 1105 | 1466
AA161187_T20 | 993 | 1354

Segment cluster AA161187_node_28 according to the present invention is supported by 4
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T21. Table 432 below describes the starting
and ending position of this segment on each transcript.

Table 432 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T21 | 657 | 1171



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 

Segment cluster AA161187_node_4 according to the present invention is supported by 22
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T0, AA161187_T15, AA161187_T16,
AA161187_T21 and AA161187_T22. Table 433 below describes the starting and ending
position of this segment on each transcript.

Table 433 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 171 | 197
AA161187_T15 | 171 | 197
AA161187_T16 | 171 | 197
AA161187_T21 | 171 | 197
AA161187_T22 | 171 | 197



Segment cluster AA161187_node_7 according to the present invention can be found in
the following transcript(s): AA161187_T7 and AA161187_T20. Table 434 below describes the
starting and ending position of this segment on each transcript.

Table 434 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T7 | 121 | 128
AA161187_T20 | 121 | 128









Segment cluster AA161187_node_5 according to the present invention is supported by 23
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16, AA161187_T20, AA161187_T21 and AA161187_T22. Table 435 below
describes the starting and ending position of this segment on each transcript.

Table 435 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 198 | 256
AA161187_T7 | 129 | 187
AA161187_T15 | 198 | 256
AA161187_T16 | 198 | 256
AA161187_T20 | 129 | 187
AA161187_T21 | 198 | 256
AA161187_T22 | 198 | 256

Segment cluster AA161187_node_9 according to the present invention is supported by 24
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16, AA161187_T20, AA161187_T21 and AA161187_T22. Table 436 below
describes the starting and ending position of this segment on each transcript.

Table 436 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 257 | 298
AA161187_T7 | 188 | 229
AA161187_T15 | 257 | 298
AA161187_T16 | 257 | 298
AA161187_T20 | 188 | 229
AA161187_T21 | 257 | 298
AA161187_T22 | 257 | 298



Segment cluster AA161187_node_10 according to the present invention is supported by
25 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16, AA161187_T20, AA161187_T21 and AA161187_T22. Table 437 below
describes the starting and ending position of this segment on each transcript.

Table 437 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 299 | 363
AA161187_T7 | 230 | 294
AA161187_T15 | 299 | 363
AA161187_T16 | 299 | 363
AA161187_T20 | 230 | 294
AA161187_T21 | 299 | 363
AA161187_T22 | 299 | 363

Segment cluster AA161187_node_12 according to the present invention can be found in
the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15, AA161187_T16,
AA161187_T21 and AA161187_T22. Table 438 below describes the starting and ending
position of this segment on each transcript.

Table 438 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 364 | 369
AA161187_T7 | 295 | 300
AA161187_T15 | 364 | 369
AA161187_T16 | 364 | 369
AA161187_T21 | 364 | 369
AA161187_T22 | 364 | 369



Segment cluster AA161187_node_13 according to the present invention is supported by
25 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16, AA161187_T20, AA161187_T21 and AA161187_T22. Table 439 below
describes the starting and ending position of this segment on each transcript.

Table 439 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 370 | 445
AA161187_T7 | 301 | 376
AA161187_T15 | 370 | 445
AA161187_T16 | 370 | 445
AA161187_T20 | 295 | 370
AA161187_T21 | 370 | 445
AA161187_T22 | 370 | 445



Segment cluster AA161187_node_19 according to the present invention is supported by 4
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AA161187_T16. Table 440 below describes the starting
and ending position of this segment on each transcript.

Table 440 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T16 | 657 | 693



Segment cluster AA161187_node_20 according to the present invention is supported by
28 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T16 and
AA161187_T20. Table 441 below describes the starting and ending position of this segment on
each transcript.

Table 441 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 657 | 682
AA161187_T7 | 588 | 613
AA161187_T16 | 694 | 719
AA161187_T20 | 582 | 607



Segment cluster AA161187_node_21 according to the present invention is supported by
31 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16 and AA161187_T20. Table 442 below describes the starting and ending
position of this segment on each transcript.

Table 442 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 683 | 741
AA161187_T7 | 614 | 672
AA161187_T15 | 657 | 715
AA161187_T16 | 720 | 778
AA161187_T20 | 608 | 666



Segment cluster AA161187_node_22 according to the present invention is supported by
34 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T15,
AA161187_T16 and AA161187_T20. Table 443 below describes the starting and ending
position of this segment on each transcript.

Table 443 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 742 | 769
AA161187_T7 | 673 | 700
AA161187_T15 | 716 | 743
AA161187_T16 | 779 | 806
AA161187_T20 | 667 | 694

Segment cluster AA161187_node_23 according to the present invention is supported by
31 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T0, AA161187_T7, AA161187_T16 and
AA161187_T20. Table 444 below describes the starting and ending position of this segment on
each transcript.



WO 2006/131783 



PCT/IB2005/004037 



539 

Table 444 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T0 | 770 | 811
AA161187_T7 | 701 | 742
AA161187_T16 | 807 | 848
AA161187_T20 | 695 | 736



Segment cluster AA161187_node_24 according to the present invention is supported by
12 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): AA161187_T16 and AA161187_T20. Table 445 below
describes the starting and ending position of this segment on each transcript.

Table 445 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
AA161187_T16 | 849 | 879
AA161187_T20 | 737 | 767
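
Taken together, the segment tables describe each transcript as a series of consecutive segments. As an illustrative check, the sketch below collects the rows for transcript AA161187_T0 from Tables 425 to 444 and tests whether they tile the transcript without gaps or overlaps. The names T0_SEGMENTS, tiles_contiguously and covered_length are hypothetical, and 1-based inclusive coordinates are assumed.

```python
# Segment coordinates on transcript AA161187_T0, copied from Tables 425 to 444.
T0_SEGMENTS = {
    "node_0": (1, 170),
    "node_4": (171, 197),
    "node_5": (198, 256),
    "node_9": (257, 298),
    "node_10": (299, 363),
    "node_12": (364, 369),
    "node_13": (370, 445),
    "node_14": (446, 656),
    "node_20": (657, 682),
    "node_21": (683, 741),
    "node_22": (742, 769),
    "node_23": (770, 811),
    "node_26": (812, 1173),
}


def tiles_contiguously(segments):
    """True if the segments start at position 1 and each one begins
    immediately after the previous one ends."""
    spans = sorted(segments.values())
    if not spans or spans[0][0] != 1:
        return False
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(spans, spans[1:]))


def covered_length(segments):
    """Total number of positions covered by the segments."""
    return sum(end - start + 1 for start, end in segments.values())


print(tiles_contiguously(T0_SEGMENTS), covered_length(T0_SEGMENTS))
```

For AA161187_T0 the thirteen listed segments are contiguous and cover positions 1 to 1173. The same check can be repeated for the other transcripts by collecting their rows from the tables.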






Variant protein alignment to the previously known protein: 

Sequence name: TEST_HUMAN

Sequence documentation:

Alignment of: AA161187_P6 x TEST_HUMAN

Alignment segment 1/1:

Quality: 2894.00
Escore: 0
Matching length: 284
Total length: 284
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  43 GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTA 92
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  31 GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTA 80

  93 AHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRY 142
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  81 AHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRY 130

 143 LGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIK 192
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 131 LGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIK 180

 193 EDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGG 242
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 181 EDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGG 230

 243 KDACFGDSGGPLACNKNGLWYQIGVVSWGVGCGRPNRPGVYTNISHHFEW 292
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 231 KDACFGDSGGPLACNKNGLWYQIGVVSWGVGCGRPNRPGVYTNISHHFEW 280

 293 IQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV 326
     ||||||||||||||||||||||||||||||||||
 281 IQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV 314



Sequence name: TEST_HUMAN

Sequence documentation:

Alignment of: AA161187_P13 x TEST_HUMAN

Alignment segment 1/1:

Quality: 1829.00
Escore: 0
Matching length: 183
Total length: 183
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50

  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100

 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150

 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183
     |||||||||||||||||||||||||||||||||
 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183



Sequence name: TEST_HUMAN

Sequence documentation:

Alignment of: AA161187_P14 x TEST_HUMAN

Alignment segment 1/1:

Quality: 1829.00
Escore: 0
Matching length: 183
Total length: 183
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50

  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100

 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150

 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183
     |||||||||||||||||||||||||||||||||
 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183



Sequence name: TEST_HUMAN

Sequence documentation:

Alignment of: AA161187_P18 x TEST_HUMAN

Alignment segment 1/1:

Quality: 1957.00
Escore: 0
Matching length: 203
Total length: 205
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 99.02
Total Percent Identity: 99.02
Gaps: 1

Alignment:

  43 GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTA 92
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  31 GPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWALTA 80

  93 AHCFET..DLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRY 140
     ||||||  ||||||||||||||||||||||||||||||||||||||||||
  81 AHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRY 130

 141 LGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIK 190
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 131 LGNSPYDIALVKLSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIK 180

 191 EDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGG 240
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 181 EDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGG 230

 241 KDACF 245
     |||||
 231 KDACF 235
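
The summary figures reported for this alignment are mutually consistent if the "Total Percent" values are taken as the matching length divided by the total length. That definition is an assumption; the application does not define the statistic, and the function name total_percent is hypothetical.

```python
def total_percent(matching_length, total_length):
    """Percentage of the total alignment length that is matched,
    rounded to two decimal places."""
    return round(100.0 * matching_length / total_length, 2)


# AA161187_P18 x TEST_HUMAN: 203 matching positions over a total length of 205.
print(total_percent(203, 205))
# AA161187_P6 x TEST_HUMAN: 284 matching positions over a total length of 284.
print(total_percent(284, 284))
```

Under this assumption the two-residue gap in the AA161187_P18 alignment accounts for the reported value of 99.02, while the gap-free alignments give 100.00.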



Sequence name: TEST_HUMAN

Sequence documentation:

Alignment of: AA161187_P19 x TEST_HUMAN

Alignment segment 1/1:

Quality: 1829.00
Escore: 0
Matching length: 183
Total length: 183
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAEL 50

  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 GRWPWQGSLRLWDSHVCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQF 100

 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 GQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYT 150

 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183
     |||||||||||||||||||||||||||||||||
 151 KHIQPICLQASTFEFENRTDCWVTGWGYIKEDE 183



Expression of Homo sapiens protease, serine, 21 (testisin) (PRSS21) AA161187 transcripts
which are detectable by amplicon as depicted in sequence name AA161187 seg25 in normal and
cancerous lung tissues

Expression of Homo sapiens protease, serine, 21 (testisin) (PRSS21) transcripts
detectable by or according to seg25, AA161187 seg25 amplicon (SEQ ID NO:1654) and primers
AA161187 seg17F2 (SEQ ID NO:1652) and AA161187 seg17R2 (SEQ ID NO:1653) was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO:331) - was measured similarly. For each RT sample, the expression of the above
amplicon was normalized to the geometric mean of the quantities of the housekeeping genes.
The normalized quantity of each RT sample was then divided by the median of the quantities of
the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above), to
obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.
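
The normalization described above can be sketched as follows. This is a minimal illustration and not the procedure actually used: the sample names and quantities are hypothetical values chosen only to make the arithmetic easy to follow, and the function names normalize and fold_upregulation are likewise hypothetical.

```python
from statistics import geometric_mean, median


def normalize(target_qty, housekeeping_qtys):
    """Divide the amplicon quantity by the geometric mean of the
    housekeeping gene quantities measured in the same RT sample."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_upregulation(normalized, normal_pm_ids):
    """Divide each normalized quantity by the median normalized quantity
    of the normal post-mortem (PM) samples."""
    reference = median(normalized[sid] for sid in normal_pm_ids)
    return {sid: qty / reference for sid, qty in normalized.items()}


# Hypothetical data: (amplicon quantity, [PBGD, HPRT1, Ubiquitin, SDHA] quantities).
samples = {
    "normal_1": (2.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_2": (4.0, [2.0, 2.0, 2.0, 2.0]),
    "normal_3": (6.0, [2.0, 2.0, 2.0, 2.0]),
    "tumor_1": (24.0, [1.0, 4.0, 2.0, 2.0]),
}

normalized = {sid: normalize(qty, hk) for sid, (qty, hk) in samples.items()}
fold = fold_upregulation(normalized, ["normal_1", "normal_2", "normal_3"])
print(fold)
```

In this made-up data set the median of the normal samples is 2.0 after normalization, so the hypothetical tumor sample, with a normalized quantity of about 12, shows roughly six-fold up-regulation.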

Figure 64 is a histogram showing over-expression of the above-indicated Homo sapiens
protease, serine, 21 (testisin) (PRSS21) transcripts in cancerous lung samples relative to the
normal samples.

As is evident from Figure 64, the expression of Homo sapiens protease, serine, 21
(testisin) (PRSS21) transcripts detectable by the above amplicon(s) was higher in a few cancer
samples than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99 Table 2).
Notably an over-expression of at least 6 fold was found in 1 out of 15 adenocarcinoma samples,
3 out of 16 squamous cell carcinoma samples and 1 out of 4 large cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: AA161187 seg17F2 forward
primer; and AA161187 seg17R2 reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: AA161187
seg25.

Primers:

Forward primer AA161187 seg17F2 (SEQ ID NO:1652):
CCCTGTGCCTTATTTGACCCT

Reverse primer AA161187 seg17R2 (SEQ ID NO:1653):
GCTGGGTAGACTGGGTGCA

Amplicon AA161187 seg25 (SEQ ID NO:1654):
CCTGTGCCTTATTTGACCCTCATGCCAACCCCGGGAGGTGGAGACTGTTGCCCCACT
CTGCAGATGCAGAAACGGAGGCTTGGCTGCTGCCAGGGGGAGGA

DESCRIPTION FOR CLUSTER R66178

Cluster R66178 features 3 transcript(s) and 16 segment(s) of interest, the names for which
are given in Tables 446 and 447, respectively, the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 448.

Table 446 - Transcripts of interest

Transcript Name | Sequence ID No.
R66178_T2 | 48
R66178_T3 | 49
R66178_T7 | 50

Table 447 - Segments of interest

Segment Name | Sequence ID No.
R66178_node_0 | 502
R66178_node_6 | 503
R66178_node_8 | 504
R66178_node_15 | 505
R66178_node_24 | 506
R66178_node_26 | 507
R66178_node_27 | 508
R66178_node_4 | 509
R66178_node_5 | 510
R66178_node_9 | 511
R66178_node_11 | 512
R66178_node_16 | 513
R66178_node_18 | 514
R66178_node_19 | 515
R66178_node_20 | 516
R66178_node_21 | 517



Table 448 - Proteins of interest

Protein Name | Sequence ID No. | Corresponding Transcript(s)
R66178_P3 | 1324 | R66178_T2
R66178_P4 | 1325 | R66178_T3
R66178_P8 | 1326 | R66178_T7



These sequences are variants of the known protein Poliovirus receptor related protein 1
precursor (SwissProt accession identifier PVR1_HUMAN; known also according to the
synonyms Herpes virus entry mediator C; HveC; Nectin 1; Herpesvirus Ig-like receptor; HIgR;
CD111 antigen), SEQ ID NO: 1432, referred to herein as the previously known protein.

Protein Poliovirus receptor related protein 1 precursor is known or believed to have the
following function(s): probably involved in cell adhesion; receptor for alphaherpesvirus (HSV-1,
HSV-2 and Pseudorabies virus) entry into cells. The sequence for protein Poliovirus receptor
related protein 1 precursor is given at the end of the application, as "Poliovirus receptor related
protein 1 precursor amino acid sequence". Protein Poliovirus receptor related protein 1 precursor
localization is believed to be Type I membrane protein (isoforms alpha and delta). Secreted
(isoform gamma).

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: immune response; cell-cell adhesion, which are annotation(s) related
to Biological Process; cell adhesion receptor; protein binding; coreceptor, which are
annotation(s) related to Molecular Function; and adherens junction; integral membrane protein,
which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl 
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available 
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>. 

As noted above, cluster R66178 features 3 transcript(s), which were listed in Table 446
above. These transcript(s) encode for protein(s) which are variant(s) of protein Poliovirus
receptor related protein 1 precursor. A description of each variant protein according to the
present invention is now provided.

Variant protein R66178_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) R66178_T2. An
alignment is given to the known protein (Poliovirus receptor related protein 1 precursor) at the
end of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R66178_P3 and PVR1_HUMAN:

1. An isolated chimeric polypeptide encoding for R66178_P3, comprising a first amino
acid sequence being at least 90% homologous to

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQWQWDSMYGFIGTDVVLHCSFANP 

LPSVOTQVTWQKSTNGSKQ 
30 EDEGWICEFATFPTGNRESQLNLTVMAKPT^ 

ANGKPPSWSWETRLKGEAEYQEIRl^NGTVTVISRYRLWSREAHQQSLACIVNY^ 
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DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP 
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT corresponding to amino
acids 1 - 334 of PVR1_HUMAN, which also corresponds to amino acids 1 - 334 of R66178_P3,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence GEGHSLPISPGVLQTQNCGP corresponding to amino acids
335 - 354 of R66178_P3, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R66178_P3, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
GEGHSLPISPGVLQTQNCGP in R66178_P3.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
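
The decision rule stated above can be sketched in a few lines. This is only an illustration: the function name, the boolean inputs, and the "undetermined" fallback are assumptions, since the application states the rule only for the secreted case and does not name every prediction program used.

```python
def predict_localization(signal_peptide_calls, transmembrane_calls):
    """Apply the rule described in the text (hypothetical helper).

    signal_peptide_calls: one boolean per signal-peptide predictor.
    transmembrane_calls:  one boolean per trans-membrane region predictor.
    """
    # "Both signal-peptide prediction programs predict a signal peptide"
    has_signal_peptide = bool(signal_peptide_calls) and all(signal_peptide_calls)
    # "Neither trans-membrane region prediction program predicts a region"
    has_tm_region = any(transmembrane_calls)
    if has_signal_peptide and not has_tm_region:
        return "secreted"
    # Assumption: any other combination is left undecided in this sketch.
    return "undetermined"


if __name__ == "__main__":
    print(predict_localization([True, True], [False, False]))
```

Requiring agreement among all predictors keeps the call conservative: a single dissenting program is enough to withhold the "secreted" label.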

Variant protein R66178_P3 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 449, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R66178_P3 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 449 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
77                                        N -> S                       No



WO 2006/131783 



PCT/IB2005/004037 




The glycosylation sites of variant protein R66178_P3, as compared to the known protein
Poliovirus receptor related protein 1 precursor, are described in Table 450 (given according to
their position(s) on the amino acid sequence in the first column; the second column indicates
whether the glycosylation site is present in the variant protein; and the last column indicates
whether the position is different on the variant protein).

Table 450 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
72                                          yes                            72
297                                         yes                            297
202                                         yes                            202
307                                         yes                            307
332                                         yes                            332
139                                         yes                            139
36                                          yes                            36
286                                         yes                            286
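
As a cross-check on the positions in Table 450, the sketch below scans the R66178_P3 sequence for the common N-linked glycosylation consensus motif N-X-S/T (X not proline). The motif rule is an assumption brought in for illustration; the application does not state how the glycosylation sites were identified. The sequence is the 334-residue head shown in the alignment at the end of the application plus the 20-residue tail given above; the function name is hypothetical.

```python
import re

# Residues 1-334 shared with PVR1_HUMAN, as shown in the alignment rows.
KNOWN_HEAD = (
    "MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH"
    "CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL"
    "RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI"
    "EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN"
    "PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT"
    "IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL"
    "FFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT"
)
P3_TAIL = "GEGHSLPISPGVLQTQNCGP"  # residues 335-354 of R66178_P3


def sequon_positions(sequence):
    """Return 1-based positions of N in every N-X-S/T motif (X != P).

    A lookahead is used so that overlapping motifs would all be reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", sequence)]


if __name__ == "__main__":
    print(sequon_positions(KNOWN_HEAD + P3_TAIL))
```

Under this assumed motif the scan reports positions 36, 72, 139, 202, 286, 297, 307 and 332, the same set listed in the table.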



Variant protein R66178_P3 is encoded by the following transcript(s): R66178_T2, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
R66178_T2 is shown in bold; this coding portion starts at position 634 and ends at position
1695. The transcript also has the following SNPs as listed in Table 451 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R66178_P3 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 451 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
474                                    -> T                        No
476                                    -> C                        No
632                                    -> T                        No
633                                    G -> T                      No
863                                    A -> G                      No
897                                    C -> T                      Yes
2178                                   A -> G                      No
2465                                   G -> A                      Yes
2687                                   G -> A                      Yes
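
The coding-portion coordinates and the SNP positions can be related by simple arithmetic. The sketch below assumes the stated positions are 1-based and inclusive and that the coding portion excludes the stop codon; both are assumptions, as are the function names. Under them, positions 634 to 1695 span 354 codons, which agrees with the 354 amino acids described for R66178_P3, and nucleotide 863 falls in codon 77, the position of the N -> S change in Table 449.

```python
CDS_START, CDS_END = 634, 1695  # coding portion of R66178_T2 as stated above


def cds_length_in_codons(start, end):
    """Number of whole codons in a 1-based inclusive coding range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a multiple of three")
    return length // 3


def codon_of(position, start=CDS_START, end=CDS_END):
    """Map a transcript position to (codon number, base within codon).

    Both values are 1-based. Returns None when the position lies outside
    the coding portion (for example in an untranslated region).
    """
    if not start <= position <= end:
        return None
    offset = position - start
    return offset // 3 + 1, offset % 3 + 1


if __name__ == "__main__":
    print(cds_length_in_codons(CDS_START, CDS_END))
    for snp in (633, 863, 897, 2178):
        print(snp, codon_of(snp))
```

Positions 474 through 633 and 2178 through 2687 fall outside the coding range, which is consistent with only one amino acid change being listed for this variant.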



Variant protein R66178_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) R66178_T3. An
alignment is given to the known protein (Poliovirus receptor related protein 1 precursor) at the
end of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R66178_P4 and PVR1_HUMAN:

1. An isolated chimeric polypeptide encoding for R66178_P4, comprising a first amino
acid sequence being at least 90% homologous to
MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANP
LPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLEL
EDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQAVLRAKKGQDDKVLVATCTS
ANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYHM
DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT corresponding to amino
acids 1 - 334 of PVR1_HUMAN, which also corresponds to amino acids 1 - 334 of R66178_P4,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence AFCQLIYPGKGRTRARMF corresponding to amino acids
335 - 352 of R66178_P4, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R66178_P4, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
AFCQLIYPGKGRTRARMF in R66178_P4.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R66178_P4 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 452, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R66178_P4 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 452 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
77                                        N -> S                       No



The glycosylation sites of variant protein R66178_P4, as compared to the known protein
Poliovirus receptor related protein 1 precursor, are described in Table 453 (given according to
their position(s) on the amino acid sequence in the first column; the second column indicates
whether the glycosylation site is present in the variant protein; and the last column indicates
whether the position is different on the variant protein).

Table 453 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
72                                          yes                            72
297                                         yes                            297
202                                         yes                            202
307                                         yes                            307
332                                         yes                            332
139                                         yes                            139
36                                          yes                            36
286                                         yes                            286



Variant protein R66178_P4 is encoded by the following transcript(s): R66178_T3, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
R66178_T3 is shown in bold; this coding portion starts at position 634 and ends at position
1689. The transcript also has the following SNPs as listed in Table 454 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R66178_P4 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 454 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
474                                    -> T                        No
476                                    -> C                        No
632                                    -> T                        No
633                                    G -> T                      No
863                                    A -> G                      No
897                                    C -> T                      Yes
1762                                   C ->                        Yes

Variant protein R66178_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) R66178_T7. An
alignment is given to the known protein (Poliovirus receptor related protein 1 precursor) at the
end of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R66178_P8 and PVR1_HUMAN:

1. An isolated chimeric polypeptide encoding for R66178_P8, comprising a first amino
acid sequence being at least 90% homologous to
MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANP
LPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLEL
EDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQAVLRAKKGQDDKVLVATCTS
ANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYHM
DRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLP
KGVEAQNRTLFFKGPINYSLAGTYICEATNPIGTRSGQVE corresponding to amino acids 1
- 330 of PVR1_HUMAN, which also corresponds to amino acids 1 - 330 of R66178_P8, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence NSPTPRLLPNMGGAPGRCPRPSLGAWRGASCWC corresponding to
amino acids 331 - 363 of R66178_P8, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R66178_P8, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
NSPTPRLLPNMGGAPGRCPRPSLGAWRGASCWC in R66178_P8.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R66178_P8 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 455, (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R66178_P8 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 455 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
77                                        N -> S                       No



The glycosylation sites of variant protein R66178_P8, as compared to the known protein
Poliovirus receptor related protein 1 precursor, are described in Table 456 (given according to
their position(s) on the amino acid sequence in the first column; the second column indicates
whether the glycosylation site is present in the variant protein; and the last column indicates
whether the position is different on the variant protein).

Table 456 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
72                                          yes                            72
297                                         yes                            297
202                                         yes                            202
307                                         yes                            307
332                                         no
139                                         yes                            139
36                                          yes                            36
286                                         yes                            286

Variant protein R66178_P8 is encoded by the following transcript(s): R66178_T7, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
R66178_T7 is shown in bold; this coding portion starts at position 634 and ends at position
1722. The transcript also has the following SNPs as listed in Table 457 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R66178_P8 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 457 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
474                                    -> T                        No
476                                    -> C                        No
632                                    -> T                        No
633                                    G -> T                      No
863                                    A -> G                      No
897                                    C -> T                      Yes
2210                                   A -> C                      No
2211                                   A -> C                      No



As noted above, cluster R66178 features 16 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R66178_node_0 according to the present invention is supported by 19 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 458 below 
describes the starting and ending position of this segment on each transcript. 

Table 458 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1                            712
R66178_T3          1                            712
R66178_T7          1                            712



Segment cluster R66178_node_6 according to the present invention is supported by 39
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 459 below
describes the starting and ending position of this segment on each transcript.

Table 459 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          762                          1063
R66178_T3          762                          1063
R66178_T7          762                          1063



Segment cluster R66178_node_8 according to the present invention is supported by 39
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 460 below
describes the starting and ending position of this segment on each transcript.

Table 460 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1064                         1269
R66178_T3          1064                         1269
R66178_T7          1064                         1269

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 461.

Table 461 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
R66178_0_7_0            lung malignant tumors       LUN



Segment cluster R66178_node_15 according to the present invention is supported by 40
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 462 below
describes the starting and ending position of this segment on each transcript.

Table 462 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1485                         1623
R66178_T3          1485                         1623
R66178_T7          1485                         1623



Segment cluster R66178_node_24 according to the present invention is supported by 10
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2. Table 463 below describes the starting and
ending position of this segment on each transcript.

Table 463 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1637                         3110









Segment cluster R66178_node_26 according to the present invention is supported by 24
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T7. Table 464 below describes the starting and
ending position of this segment on each transcript.

Table 464 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T7          1624                         2087






Segment cluster R66178_node_27 according to the present invention is supported by 12 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): R66178_T7. Table 465 below describes the starting and 
ending position of this segment on each transcript. 

Table 465 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T7          2088                         2364



Short segments related to the above cluster are also provided. These segments are up to
about 120 bp in length, and so are included in a separate description.



Segment cluster R66178_node_4 according to the present invention is supported by 21
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 466 below
describes the starting and ending position of this segment on each transcript.

Table 466 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          713                          749
R66178_T3          713                          749
R66178_T7          713                          749



Segment cluster R66178_node_5 according to the present invention can be found in the
following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 467 below describes
the starting and ending position of this segment on each transcript.

Table 467 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          750                          761
R66178_T3          750                          761
R66178_T7          750                          761



Segment cluster R66178_node_9 according to the present invention is supported by 44
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 468 below
describes the starting and ending position of this segment on each transcript.

Table 468 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1270                         1366
R66178_T3          1270                         1366
R66178_T7          1270                         1366

Segment cluster R66178_node_11 according to the present invention is supported by 44
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T2, R66178_T3 and R66178_T7. Table 469 below
describes the starting and ending position of this segment on each transcript.

Table 469 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1367                         1484
R66178_T3          1367                         1484
R66178_T7          1367                         1484



Segment cluster R66178_node_16 according to the present invention can be found in the
following transcript(s): R66178_T2 and R66178_T3. Table 470 below describes the starting and
ending position of this segment on each transcript.

Table 470 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T2          1624                         1636
R66178_T3          1624                         1636









Segment cluster R66178_node_18 according to the present invention is supported by 13
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T3. Table 471 below describes the starting and
ending position of this segment on each transcript.

Table 471 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T3          1637                         1743



Segment cluster R66178_node_19 according to the present invention can be found in the
following transcript(s): R66178_T3. Table 472 below describes the starting and ending position
of this segment on each transcript.

Table 472 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T3          1744                         1763



Segment cluster R66178_node_20 according to the present invention is supported by 12
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T3. Table 473 below describes the starting and
ending position of this segment on each transcript.

Table 473 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T3          1764                         1791



Segment cluster R66178_node_21 according to the present invention is supported by 11
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R66178_T3. Table 474 below describes the starting and
ending position of this segment on each transcript.

Table 474 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
R66178_T3          1792                         1903
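
The segment coordinates in Tables 458 to 474 can be checked for consistency. The sketch below collects the start and end positions reported for each transcript and tests whether they tile the transcript from position 1 without gaps or overlaps. The dictionary layout and function names are assumptions made for illustration; only the coordinates come from the tables.

```python
# Segment coordinates per transcript, copied from Tables 458-474.
SHARED = [(1, 712), (713, 749), (750, 761), (762, 1063),
          (1064, 1269), (1270, 1366), (1367, 1484), (1485, 1623)]

SEGMENTS = {
    "R66178_T2": SHARED + [(1624, 1636), (1637, 3110)],
    "R66178_T3": SHARED + [(1624, 1636), (1637, 1743), (1744, 1763),
                           (1764, 1791), (1792, 1903)],
    "R66178_T7": SHARED + [(1624, 2087), (2088, 2364)],
}


def tiles_contiguously(intervals):
    """True if the 1-based inclusive intervals start at 1 and abut exactly."""
    ordered = sorted(intervals)
    if not ordered or ordered[0][0] != 1:
        return False
    return all(prev_end + 1 == start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))


def covered_length(intervals):
    """Last position covered by the segments of one transcript."""
    return max(end for _, end in intervals)


if __name__ == "__main__":
    for name, intervals in SEGMENTS.items():
        print(name, tiles_contiguously(intervals), covered_length(intervals))
```

All three transcripts share the first eight segments (positions 1 to 1623) and differ only downstream, which matches the variant proteins sharing their first 330 to 334 residues and differing in their tails.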



Variant protein alignment to the previously known protein:

Sequence name: PVR1_HUMAN

Sequence documentation:

Alignment of: R66178_P3 x PVR1_HUMAN

Alignment segment 1/1:

Quality: 3286.00
Escore: 0
Matching length: 334
Total length: 334
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50

  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100

 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150

 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200

 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250

 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300

 301 FFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT 334
     ||||||||||||||||||||||||||||||||||
 301 FFKGPINYSLAGTYICEATNPIGTRSGQVEVNIT 334

Sequence name: PVR1_HUMAN

Sequence documentation:

Alignment of: R66178_P4 x PVR1_HUMAN

Alignment segment 1/1:

Quality: 3294.00
Escore: 0
Matching length: 336
Total length: 336
Matching Percent Similarity: 99.70
Matching Percent Identity: 99.70
Total Percent Similarity: 99.70
Total Percent Identity: 99.70
Gaps: 0

Alignment:

   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50

  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100

 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150

 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200

 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250

 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300

 301 FFKGPINYSLAGTYICEATNPIGTRSGQVEVNITAF 336
     |||||||||||||||||||||||||||||||||| |
 301 FFKGPINYSLAGTYICEATNPIGTRSGQVEVNITEF 336
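
The 99.70 identity figure reported for the R66178_P4 alignment can be reproduced from the aligned rows. The sketch below assumes that percent identity is the number of identical aligned positions divided by the alignment length (an assumption; the application does not define the statistic) and that the first 300 aligned positions are identical, as the alignment rows show. Only the final rows, which contain the single mismatch, are compared character by character.

```python
def count_matches(row_a, row_b):
    """Count identical positions between two equal-length aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return sum(a == b for a, b in zip(row_a, row_b))


# Final alignment rows (positions 301-336) for variant and known protein.
VARIANT_ROW = "FFKGPINYSLAGTYICEATNPIGTRSGQVEVNITAF"
KNOWN_ROW = "FFKGPINYSLAGTYICEATNPIGTRSGQVEVNITEF"

IDENTICAL_PREFIX = 300  # positions 1-300 match in every preceding row
ALIGNMENT_LENGTH = IDENTICAL_PREFIX + len(VARIANT_ROW)

matches = IDENTICAL_PREFIX + count_matches(VARIANT_ROW, KNOWN_ROW)
percent_identity = 100.0 * matches / ALIGNMENT_LENGTH

if __name__ == "__main__":
    print(f"{matches}/{ALIGNMENT_LENGTH} = {percent_identity:.2f}")
```

The single mismatch is at position 335, the first residue of the variant's unique tail.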



Sequence name: PVR1_HUMAN

Sequence documentation:

Alignment of: R66178_P8 x PVR1_HUMAN

Alignment segment 1/1:

Quality: 3250.00
Escore: 0
Matching length: 330
Total length: 330
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLH 50

  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 CSFANPLPSVKITQVTWQKSTNGSKQNVAIYNPSMGVSVLAPYRERVEFL 100

 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 RPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWI 150

 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 EGTQAVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRN 200

 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 PNGTVTVISRYRLVPSREAHQQSLACIVNYHMDRFKESLTLNVQYEPEVT 250

 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 IEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTL 300

 301 FFKGPINYSLAGTYICEATNPIGTRSGQVE 330
     ||||||||||||||||||||||||||||||
 301 FFKGPINYSLAGTYICEATNPIGTRSGQVE 330

DESCRIPTION FOR CLUSTER HUMPHOSLIP

Cluster HUMPHOSLIP features 7 transcript(s) and 53 segment(s) of interest, the names
for which are given in Tables 475 and 476, respectively; the sequences themselves are given at
the end of the application. The selected protein variants are given in Table 477.

Table 475 - Transcripts of interest

Transcript Name         Sequence ID No.
HUMPHOSLIP_PEA_2_T6     51
HUMPHOSLIP_PEA_2_T7     52
HUMPHOSLIP_PEA_2_T14    53
HUMPHOSLIP_PEA_2_T16    54
HUMPHOSLIP_PEA_2_T17    55
HUMPHOSLIP_PEA_2_T18    56
HUMPHOSLIP_PEA_2_T19    57


Table 476 - Segments of interest

Segment Name                Sequence ID No.
HUMPHOSLIP_PEA_2_node_0     518
HUMPHOSLIP_PEA_2_node_19    519
HUMPHOSLIP_PEA_2_node_34    520
HUMPHOSLIP_PEA_2_node_68    521
HUMPHOSLIP_PEA_2_node_70    522
HUMPHOSLIP_PEA_2_node_75    523
HUMPHOSLIP_PEA_2_node_2     524
HUMPHOSLIP_PEA_2_node_3     525
HUMPHOSLIP_PEA_2_node_4     526
HUMPHOSLIP_PEA_2_node_6     527
HUMPHOSLIP_PEA_2_node_7     528
HUMPHOSLIP_PEA_2_node_8     529
HUMPHOSLIP_PEA_2_node_9     530
HUMPHOSLIP_PEA_2_node_14    531
HUMPHOSLIP_PEA_2_node_15    532
HUMPHOSLIP_PEA_2_node_16    533
HUMPHOSLIP_PEA_2_node_17    534
HUMPHOSLIP_PEA_2_node_23    535
HUMPHOSLIP_PEA_2_node_24    536
HUMPHOSLIP_PEA_2_node_25    537
HUMPHOSLIP_PEA_2_node_26    538
HUMPHOSLIP_PEA_2_node_29    539
HUMPHOSLIP_PEA_2_node_30    540
HUMPHOSLIP_PEA_2_node_33    541
HUMPHOSLIP_PEA_2_node_36    542
HUMPHOSLIP_PEA_2_node_37    543
HUMPHOSLIP_PEA_2_node_39    544
HUMPHOSLIP_PEA_2_node_40    545
HUMPHOSLIP_PEA_2_node_41    546
HUMPHOSLIP_PEA_2_node_42    547
HUMPHOSLIP_PEA_2_node_44    548
HUMPHOSLIP_PEA_2_node_45    549
HUMPHOSLIP_PEA_2_node_47    550
HUMPHOSLIP_PEA_2_node_51    551
HUMPHOSLIP_PEA_2_node_52    552
HUMPHOSLIP_PEA_2_node_53    553
HUMPHOSLIP_PEA_2_node_54    554
HUMPHOSLIP_PEA_2_node_55    555
HUMPHOSLIP_PEA_2_node_58    556
HUMPHOSLIP_PEA_2_node_59    557
HUMPHOSLIP_PEA_2_node_60    558
HUMPHOSLIP_PEA_2_node_61    559
HUMPHOSLIP_PEA_2_node_62    560
HUMPHOSLIP_PEA_2_node_63    561
HUMPHOSLIP_PEA_2_node_64    562
HUMPHOSLIP_PEA_2_node_65    563
HUMPHOSLIP_PEA_2_node_66    564
HUMPHOSLIP_PEA_2_node_67    565
HUMPHOSLIP_PEA_2_node_69    566
HUMPHOSLIP_PEA_2_node_71    567
HUMPHOSLIP_PEA_2_node_72    568
HUMPHOSLIP_PEA_2_node_73    569
HUMPHOSLIP_PEA_2_node_74    570



Table 477 - Proteins of interest

Protein Name | Sequence ID No. | Corresponding Transcript(s)
HUMPHOSLIP_PEA_2_P10 | 1327 | HUMPHOSLIP_PEA_2_T17
HUMPHOSLIP_PEA_2_P12 | 1328 | HUMPHOSLIP_PEA_2_T19
HUMPHOSLIP_PEA_2_P30 | 1329 | HUMPHOSLIP_PEA_2_T6
HUMPHOSLIP_PEA_2_P31 | 1330 | HUMPHOSLIP_PEA_2_T7
HUMPHOSLIP_PEA_2_P33 | 1331 | HUMPHOSLIP_PEA_2_T14
HUMPHOSLIP_PEA_2_P34 | 1332 | HUMPHOSLIP_PEA_2_T16
HUMPHOSLIP_PEA_2_P35 | 1333 | HUMPHOSLIP_PEA_2_T18



These sequences are variants of the known protein Phospholipid transfer protein precursor 
(SwissProt accession identifier PLTP_HUMAN; known also according to the synonyms Lipid 
transfer protein II), SEQ ID NO: 1433, referred to herein as the previously known protein. 

Protein Phospholipid transfer protein precursor is known or believed to have the following 
function(s): Converts HDL into larger and smaller particles. May play a key role in extracellular 
phospholipid transport and modulation of HDL particles. The sequence for protein Phospholipid 
transfer protein precursor is given at the end of the application, as "Phospholipid transfer protein 
precursor amino acid sequence". Known polymorphisms for this sequence are as shown in Table 
478. 

Table 478 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence | Comment
282 | R -> Q. /FTId=VAR_017020.
372 | R -> H. /FTId=VAR_017021.
380 | R -> W (in dbSNP:6065903). /FTId=VAR_017022.
444 | F -> L (in dbSNP:1804161). /FTId=VAR_012073.
487 | T -> K (in dbSNP:1056929). /FTId=VAR_012074.
18 | E -> V



Protein Phospholipid transfer protein precursor localization is believed to be Secreted. 

The following GO Annotation(s) apply to the previously known protein. The following 
annotation(s) were found: lipid metabolism; lipid transport, which are annotation(s) related to 
Biological Process; lipid binding, which are annotation(s) related to Molecular Function; and 
extracellular, which are annotation(s) related to Cellular Component. 

The GO assignment relies on information from one or more of the SwissProt/TremBl 
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available 
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

For this cluster, at least one oligonucleotide was found to demonstrate overexpression of
the cluster, although not of at least one transcript/segment as listed below. Microarray (chip)
data is also available for this cluster as follows. Various oligonucleotides were tested for being
differentially expressed in various disease conditions, particularly cancer, as previously
described. The following oligonucleotides were found to hit this cluster but not other
segments/transcripts below, shown in Table 479, with regard to lung cancer.

Table 479 - Oligonucleotides related to this cluster 



Oligonucleotide name


Overexpressed in cancers




HUMPHOSLIP_0_0_18458


lung malignant tumors 


LUN 


As noted above, cluster HUMPHOSLIP features 7 transcript(s), which were listed in
Table 1 above. These transcript(s) encode for protein(s) which are variant(s) of protein
Phospholipid transfer protein precursor. A description of each variant protein according to the
present invention is now provided.

Variant protein HUMPHOSLIP_PEA_2_P10 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T17. An alignment is given to the known protein (Phospholipid
transfer protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P10 and PLTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P10,
comprising a first amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISE corresponding to amino acids 1 - 67 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 67 of HUMPHOSLIP_PEA_2_P10, and a second amino acid sequence being at
least 90 % homologous to

KVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMK 
DPVASTSNLDMDFRGAFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMES
YFRAGALQLLLVGDKVPHDLDMLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKP
SGTTISVTASVTIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSN 
HSALESLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVTNHAGFLTI 
GADLHFAKGLREVIEKNRPADVRASTAPTPSTAAV corresponding to amino acids 163 -
493 of PLTP_HUMAN, which also corresponds to amino acids 68 - 398 of
HUMPHOSLIP_PEA_2_P10, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
HUMPHOSLIP_PEA_2_P10, comprising a polypeptide having a length "n", wherein n is at
least about 10 amino acids in length, optionally at least about 20 amino acids in length,
preferably at least about 30 amino acids in length, more preferably at least about 40 amino acids
in length and most preferably at least about 50 amino acids in length, wherein at least two amino
acids comprise EK, having a structure as follows: a sequence starting from any of amino acid
numbers 67-x to 67; and ending at any of amino acid numbers 68 + ((n-2) - x), in which x varies
from 0 to n-2.
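
The edge-portion formula above can be illustrated with a short sketch. It enumerates, for a chosen length n, every window that starts at residue 67-x and ends at residue 68 + ((n-2) - x) for x from 0 to n-2, so each window has length n and spans the E(67)/K(68) junction. The function name `edge_windows`, the `protein_length` parameter, and the bounds check are illustrative assumptions and are not taken from the application; positions are treated as 1-based and inclusive.

```python
def edge_windows(junction_left: int, n: int, protein_length: int) -> list[tuple[int, int]]:
    """Enumerate 1-based, inclusive (start, end) windows of length n that span
    the junction between residues junction_left and junction_left + 1."""
    if n < 2:
        raise ValueError("n must be at least 2 to cover both junction residues")
    windows = []
    for x in range(n - 1):  # x runs from 0 to n-2
        start = junction_left - x
        end = junction_left + 1 + (n - 2) - x
        # The bounds check is an added assumption, not part of the quoted formula.
        if start >= 1 and end <= protein_length:
            windows.append((start, end))
    return windows


# Junction at residues 67/68 of the 398-residue variant, shortest stated length n = 10.
windows = edge_windows(junction_left=67, n=10, protein_length=398)
for start, end in windows:
    print(start, end)
```

For n = 10 this yields nine windows, from residues 67-76 (x = 0) to residues 59-68 (x = 8).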

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
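
The reasoning in the preceding paragraph amounts to a simple consensus rule, sketched below. This is only an illustration of the stated logic: the function name `infer_localization`, the boolean inputs, and the "undetermined" fallback are assumptions, since the text here describes only the case in which the call is "secreted".

```python
def infer_localization(signal_peptide_calls: list[bool],
                       transmembrane_calls: list[bool]) -> str:
    """Return "secreted" when every signal-peptide predictor is positive and
    no trans-membrane predictor is positive; otherwise "undetermined"."""
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one prediction of each kind is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    # Hypothetical fallback: the other outcomes are not described in this passage.
    return "undetermined"


# Two signal-peptide programs agree, and neither trans-membrane program fires.
print(infer_localization([True, True], [False, False]))
```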

Variant protein HUMPHOSLIP_PEA_2_P10 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 480 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P10 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 480 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
16 | H -> R | Yes
18 | E -> V | Yes
113 | S -> F | Yes
118 | V -> | No
140 | R -> | No
140 | R -> P | No
150 | N -> | No
160 | P -> | No
201 | P -> | No
274 | M -> | No
285 | R -> W | Yes
292 | Q -> | No
315 | L -> * | No
330 | M -> I | Yes
349 | F -> L | Yes
392 | T -> K | Yes



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P10, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 481 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).



Table 481 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein?
94 | no |
143 | no |
64 | yes | 64
245 | yes | 150
398 | yes | 303
117 | no |
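
The positions in Table 481 follow from the segment correspondence given in the comparison report: amino acids 1 - 67 of PLTP_HUMAN map to 1 - 67 of the variant, and amino acids 163 - 493 map to 68 - 398. The sketch below applies that correspondence to the listed glycosylation sites. The names `P10_SEGMENTS` and `known_to_variant` are illustrative assumptions, and positions are taken as 1-based and inclusive.

```python
from typing import Optional

# (known-protein range, variant range) pairs from the comparison report for P10.
P10_SEGMENTS = [((1, 67), (1, 67)), ((163, 493), (68, 398))]


def known_to_variant(position: int, segments=P10_SEGMENTS) -> Optional[int]:
    """Map a position on the known protein to the variant protein, or return
    None when the position lies in a region absent from the variant."""
    for (known_start, known_end), (variant_start, _variant_end) in segments:
        if known_start <= position <= known_end:
            return variant_start + (position - known_start)
    return None


glycosylation_sites = [94, 143, 64, 245, 398, 117]
mapped = {site: known_to_variant(site) for site in glycosylation_sites}
print(mapped)
```

Sites 94, 143 and 117 fall in the skipped region (amino acids 68 - 162 of the known protein) and so have no counterpart, while 245 and 398 shift by 95 residues to 150 and 303.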





Variant protein HUMPHOSLIP_PEA_2_P10 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T17, for which the sequence(s) is/are given at the end of the
application. The coding portion of transcript HUMPHOSLIP_PEA_2_T17 is shown in bold; this
coding portion starts at position 276 and ends at position 1469. The transcript also has the
following SNPs as listed in Table 482 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P10
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 482 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
174 | G -> T | No
175 | A -> T | No
322 | A -> G | Yes
328 | A -> T | Yes
431 | G -> A | Yes
551 | C -> T | Yes
613 | C -> T | Yes
628 | T -> | No
694 | G -> | No
694 | G -> C | No
723 | A -> | No
753 | C -> | No
876 | C -> | No
1037 | C -> T | Yes
1097 | G -> | No
1128 | C -> T | Yes
1149 | C -> | No
1219 | T -> A | No
1230 | C -> T | Yes
1265 | G -> C | Yes
1322 | T -> A | Yes
1450 | C -> A | Yes
1469 | C -> T | No
1549 | C -> T | Yes
1565 | A -> G | No
1565 | A -> T | No
1630 | A -> G | Yes
1654 | T -> A | No
1731 | G -> T | Yes
1864 | G -> A | Yes
1893 | G -> T | Yes
2073 | G -> A | Yes
2269 | C -> T | Yes
2325 | G -> T | Yes
2465 | C -> T | Yes
2566 | C -> T | Yes
2881 | A -> G | No
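
The stated coding portion of transcript HUMPHOSLIP_PEA_2_T17 (positions 276 to 1469) can be related to the amino acid positions of Table 480 with simple arithmetic, sketched below. The sketch assumes that the coordinates are 1-based and inclusive and that the stated coding portion contains whole codons only; the function names are illustrative and not taken from the application.

```python
from typing import Optional


def coding_length_in_codons(cds_start: int, cds_end: int) -> int:
    """Number of codons in an inclusive, 1-based coding interval."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding interval is not a whole number of codons")
    return length // 3


def residue_for_nucleotide(nt_position: int, cds_start: int, cds_end: int) -> Optional[int]:
    """Residue number affected by a nucleotide position, or None outside the coding portion."""
    if not cds_start <= nt_position <= cds_end:
        return None
    return (nt_position - cds_start) // 3 + 1


CDS_START, CDS_END = 276, 1469
codons = coding_length_in_codons(CDS_START, CDS_END)
residues = {nt: residue_for_nucleotide(nt, CDS_START, CDS_END)
            for nt in (174, 322, 328, 613, 1450)}
print(codons, residues)
```

Under these assumptions the coding portion spans 398 codons, matching the 398-residue variant, and nucleotide positions 322, 328, 613 and 1450 fall in codons 16, 18, 113 and 392, which are positions listed in Table 480. Position 174 lies upstream of the coding portion.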



Variant protein HUMPHOSLIP_PEA_2_P12 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T19. An alignment is given to the known protein (Phospholipid
transfer protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P12 and PLTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P12,

comprising a first amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 

FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS 

AEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMHAAFGGTFKKVYDFLSTFITSGMRF

LLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRG
AFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDK 
VPHDLDMLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASVTIALVP 
PDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHSALESLALIPLQAPLK 
TMLQIGVMPMLN corresponding to amino acids 1 - 427 of PLTP_HUMAN, which also
corresponds to amino acids 1 - 427 of HUMPHOSLIP_PEA_2_P12, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
GKAGV corresponding to amino acids 428 - 432 of HUMPHOSLIP_PEA_2_P12, wherein said
first amino acid sequence and second amino acid sequence are contiguous and in a sequential
order.

2. An isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P12, comprising
a polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GKAGV in HUMPHOSLIP_PEA_2_P12.
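
To give a feel for the percentage thresholds recited above, the sketch below computes a plain position-by-position identity between a candidate peptide and the tail GKAGV. This is a deliberately simplified stand-in: it assumes equal-length, ungapped sequences and is not the homology determination used in the application. The function name `percent_identity` and the example candidate are illustrative assumptions.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of identical residues between two equal-length, ungapped peptides."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


TAIL = "GKAGV"  # amino acids 428 - 432 of HUMPHOSLIP_PEA_2_P12
print(percent_identity("GKAGV", TAIL))
print(percent_identity("GKAGA", TAIL))  # hypothetical candidate with one substitution
```

Because the tail is only five residues long, a single substitution already lowers this simple identity measure to 80%, so only an exact match would meet the 85%, 90% and 95% levels under this simplification.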


The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMPHOSLIP_PEA_2_P12 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 483 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P12 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 483 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
16 | H -> R | Yes
18 | E -> V | Yes
81 | D -> H | Yes
124 | S -> Y | Yes
160 | T -> | No
160 | T -> N | No
208 | S -> F | Yes
213 | V -> | No
235 | R -> P | No
235 | R -> | No
245 | N -> | No
255 | P -> | No
296 | P -> | No
369 | M -> | No
380 | R -> W | Yes
387 | Q -> | No
410 | L -> * | No
425 | M -> I | Yes



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P12, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 484 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 484 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein?
94 | yes | 94
143 | yes | 143
64 | yes | 64
245 | yes | 245
398 | yes | 398
117 | yes | 117

Variant protein HUMPHOSLIP_PEA_2_P12 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T19, for which the sequence(s) is/are given at the end of the
application. The coding portion of transcript HUMPHOSLIP_PEA_2_T19 is shown in bold; this
coding portion starts at position 276 and ends at position 1571. The transcript also has the
following SNPs as listed in Table 485 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P12
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 485 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
174 | G -> T | No
175 | A -> T | No
322 | A -> G | Yes
328 | A -> T | Yes
431 | G -> A | Yes
516 | G -> C | Yes
644 | G -> A | Yes
646 | C -> A | Yes
754 | C -> | No
754 | C -> A | No
836 | C -> T | Yes
898 | C -> T | Yes
913 | T -> | No
979 | G -> | No
979 | G -> C | No
1008 | A -> | No
1038 | C -> | No
1161 | C -> | No
1322 | C -> T | Yes
1382 | G -> | No
1413 | C -> T | Yes
1434 | C -> | No
1504 | T -> A | No
1515 | C -> T | Yes
1550 | G -> C | Yes
1690 | T -> A | Yes
1818 | C -> A | Yes
1837 | C -> T | No
1917 | C -> T | Yes
1933 | A -> G | No
1933 | A -> T | No
1998 | A -> G | Yes
2022 | T -> A | No
2099 | G -> T | Yes
2232 | G -> A | Yes
2261 | G -> T | Yes
2441 | G -> A | Yes
2637 | C -> T | Yes
2693 | G -> T | Yes
2833 | C -> T | Yes
2934 | C -> T | Yes
3249 | A -> G | No



Variant protein HUMPHOSLIP_PEA_2_P30 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T6. The location of the variant protein was determined according to
results from a number of different software programs and analyses, including analyses from
SignalP and other specialized programs. The variant protein is believed to be located as follows
with regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein HUMPHOSLIP_PEA_2_P30 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 486 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P30 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 486 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
16 | H -> R | Yes
18 | E -> V | Yes
37 | R -> Q | Yes



Variant protein HUMPHOSLIP_PEA_2_P30 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T6, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMPHOSLIP_PEA_2_T6 is shown in bold; this coding
portion starts at position 276 and ends at position 431. The transcript also has the following
SNPs as listed in Table 487 (given according to their position on the nucleotide sequence, with
the alternative nucleic acid listed; the last column indicates whether the SNP is known or not;
the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P30 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 487 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
174 | G -> T | No
175 | A -> T | No
322 | A -> G | Yes
328 | A -> T | Yes
385 | G -> A | Yes
470 | G -> C | Yes
598 | G -> A | Yes
600 | C -> A | Yes
708 | C -> | No
708 | C -> A | No
790 | C -> T | Yes
852 | C -> T | Yes
867 | T -> | No
933 | G -> | No
933 | G -> C | No
962 | A -> | No
992 | C -> | No
1115 | C -> | No
1276 | C -> T | Yes
1336 | G -> | No
1367 | C -> T | Yes
1388 | C -> | No
1458 | T -> A | No
1469 | C -> T | Yes
1504 | G -> C | Yes
1561 | T -> A | Yes
1689 | C -> A | Yes
1708 | C -> T | No
1788 | C -> T | Yes
1804 | A -> G | No
1804 | A -> T | No
1869 | A -> G | Yes
1893 | T -> A | No
1970 | G -> T | Yes
2103 | G -> A | Yes
2132 | G -> T | Yes
2312 | G -> A | Yes
2508 | C -> T | Yes
2564 | G -> T | Yes
2704 | C -> T | Yes
2805 | C -> T | Yes
3120 | A -> G | No



Variant protein HUMPHOSLIP_PEA_2_P31 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T7. An alignment is given to the known protein (Phospholipid transfer
protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P31 and PLTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P31,
comprising a first amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH 
FYYNISE corresponding to amino acids 1 - 67 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 67 of HUMPHOSLIP_PEA_2_P31, and a second amino acid sequence being at
least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and
most preferably at least 95% homologous to a polypeptide having the sequence
PGLERGADKFPWGGSSLFLALDLTLRPPVG corresponding to amino acids 68 - 98 of
HUMPHOSLIP_PEA_2_P31, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P31, comprising
a polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence PGLERGADKFPWGGSSLFLALDLTLRPPVG in HUMPHOSLIP_PEA_2_P31.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMPHOSLIP_PEA_2_P31 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 488 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P31 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 488 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
16 | H -> R | Yes
18 | E -> V | Yes



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P31, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 489 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 489 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein?
94 | no |
143 | no |
64 | yes | 64
245 | no |
398 | no |
117 | no |





Variant protein HUMPHOSLIP_PEA_2_P31 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T7, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMPHOSLIP_PEA_2_T7 is shown in bold; this coding
portion starts at position 276 and ends at position 569. The transcript also has the following
SNPs as listed in Table 490 (given according to their position on the nucleotide sequence, with
the alternative nucleic acid listed; the last column indicates whether the SNP is known or not;
the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P31 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 490 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
174 | G -> T | No
175 | A -> T | No
322 | A -> G | Yes
328 | A -> T | Yes
431 | G -> A | Yes
608 | G -> C | Yes
736 | G -> A | Yes
738 | C -> A | Yes
846 | C -> | No
846 | C -> A | No
928 | C -> T | Yes
990 | C -> T | Yes
1005 | T -> | No
1071 | G -> | No
1071 | G -> C | No
1100 | A -> | No
1130 | C -> | No
1253 | C -> | No
1414 | C -> T | Yes
1474 | G -> | No
1505 | C -> T | Yes
1526 | C -> | No
1596 | T -> A | No
1607 | C -> T | Yes
1642 | G -> C | Yes
1699 | T -> A | Yes
1827 | C -> A | Yes
1846 | C -> T | No
1926 | C -> T | Yes
1942 | A -> G | No
1942 | A -> T | No
2007 | A -> G | Yes
2031 | T -> A | No
2108 | G -> T | Yes
2241 | G -> A | Yes
2270 | G -> T | Yes
2450 | G -> A | Yes
2646 | C -> T | Yes
2702 | G -> T | Yes
2842 | C -> T | Yes
2943 | C -> T | Yes
3258 | A -> G | No



Variant protein HUMPHOSLIP_PEA_2_P33 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T14. An alignment is given to the known protein (Phospholipid
transfer protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P33 and PLTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P33,
comprising a first amino acid sequence being at least 90 % homologous to 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS
AEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMHAAFGGTFKKVYDFLSTFITSGMRF 
LLNQQ corresponding to amino acids 1 - 183 of PLTP_HUMAN, which also corresponds to
amino acids 1 - 183 of HUMPHOSLIP_PEA_2_P33, and a second amino acid sequence being
at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and
most preferably at least 95% homologous to a polypeptide having the sequence
VWAATGRRVARVGMLSL corresponding to amino acids 184 - 200 of
HUMPHOSLIP_PEA_2_P33, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.
2. An isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P33, comprising
a polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VWAATGRRVARVGMLSL in HUMPHOSLIP_PEA_2_P33.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMPHOSLIP_PEA_2_P33 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 491 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P33 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 491 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
16 | H -> R | Yes
18 | E -> V | Yes
81 | D -> H | Yes
124 | S -> Y | Yes
160 | T -> | No
160 | T -> N | No



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P33, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 492 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).



Table 492 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein?
94 | yes | 94
143 | yes | 143
64 | yes | 64
245 | no |
398 | no |
117 | yes | 117



Variant protein HUMPHOSLIP_PEA_2_P33 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T14, for which the sequence(s) is/are given at the end of the
application. The coding portion of transcript HUMPHOSLIP_PEA_2_T14 is shown in bold; this
coding portion starts at position 276 and ends at position 875. The transcript also has the
following SNPs as listed in Table 493 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P33
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 493 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
174 | G -> T | No
175 | A -> T | No
322 | A -> G | Yes
328 | A -> T | Yes
431 | G -> A | Yes
516 | G -> C | Yes
644 | G -> A | Yes
646 | C -> A | Yes
754 | C -> | No
754 | C -> A | No
921 | C -> T | Yes
983 | C -> T | Yes
998 | T -> | No
1064 | G -> | No
1064 | G -> C | No
1093 | A -> | No
1123 | C -> | No
1246 | C -> | No
1407 | C -> T | Yes
1467 | G -> | No
1498 | C -> T | Yes
1519 | C -> | No
1589 | T -> A | No
1600 | C -> T | Yes
1635 | G -> C | Yes
1692 | T -> A | Yes
1820 | C -> A | Yes
1839 | C -> T | No
1919 | C -> T | Yes
1935 | A -> G | No
1935 | A -> T | No
2000 | A -> G | Yes
2024 | T -> A | No
2101 | G -> T | Yes
2234 | G -> A | Yes
2263 | G -> T | Yes
2443 | G -> A | Yes
2639 | C -> T | Yes
2695 | G -> T | Yes
2835 | C -> T | Yes
2936 | C -> T | Yes
3251 | A -> G | No
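
The residue ranges in the comparison reports and the stated coding portions can be cross-checked against each other. The sketch below does this for the four variants above for which both are given (P10, P12, P31 and P33): it tests that the two segments of each variant are contiguous starting at residue 1, and that the coding portion is exactly three nucleotides per residue. The dictionary layout and the function name `is_consistent` are illustrative assumptions; coordinates are treated as 1-based and inclusive.

```python
# Residue ranges (variant numbering) and coding portions as stated in the text above.
VARIANTS = {
    "HUMPHOSLIP_PEA_2_P10": {"segments": [(1, 67), (68, 398)], "cds": (276, 1469)},
    "HUMPHOSLIP_PEA_2_P12": {"segments": [(1, 427), (428, 432)], "cds": (276, 1571)},
    "HUMPHOSLIP_PEA_2_P31": {"segments": [(1, 67), (68, 98)], "cds": (276, 569)},
    "HUMPHOSLIP_PEA_2_P33": {"segments": [(1, 183), (184, 200)], "cds": (276, 875)},
}


def is_consistent(segments, cds) -> bool:
    """True when segments tile the protein from residue 1 without gaps and the
    coding portion length equals three nucleotides per residue."""
    expected_start = 1
    for start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    protein_length = segments[-1][1]
    cds_length = cds[1] - cds[0] + 1
    return cds_length == 3 * protein_length


report = {name: is_consistent(v["segments"], v["cds"]) for name, v in VARIANTS.items()}
print(report)
```

All four variants pass this check, which suggests that the stated coding portions do not include a stop codon.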



Variant protein HUMPHOSLIP_PEA_2_P34 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T16. An alignment is given to the known protein (Phospholipid
transfer protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P34 and PLTP_HUMAN:

l.An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA__2_P34 ? 
comprising a first amino acid sequence being at least 90 % homologous to 
MALFGALFLALLAGAHAEFP 

FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINAS 
AEGVSIRTGLELSRDPAGRM^

LLNQQICPVLYHAGTVLLNSLLDTVPV corresponding to amino acids 1 - 205 of
PLTP_HUMAN, which also corresponds to amino acids 1 - 205 of
HUMPHOSLIP_PEA_2_P34, and a second amino acid sequence being at least 70%, optionally
at least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence LWTSLLALTIPS corresponding to
amino acids 206 - 217 of HUMPHOSLIP_PEA_2_P34, wherein said first amino acid sequence
and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P34, comprising
a polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence LWTSLLALTIPS in HUMPHOSLIP_PEA_2_P34.
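
The percentage thresholds recited for the tail sequence can be illustrated with a small calculation. The sketch below assumes that "homologous" is approximated by ungapped percent identity between two peptides of equal length; the text does not state which scoring method is intended, and the function name `percent_identity` and the candidate peptide are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Ungapped percent identity between two equal-length peptides."""
    if not reference:
        raise ValueError("peptides must not be empty")
    if len(reference) != len(candidate):
        raise ValueError("an ungapped comparison needs peptides of equal length")
    # Count positions where both peptides carry the same residue.
    matches = sum(1 for a, b in zip(reference, candidate) if a == b)
    return 100.0 * matches / len(reference)


# Tail of HUMPHOSLIP_PEA_2_P34 (amino acids 206 - 217), as given in the text.
TAIL = "LWTSLLALTIPS"

# Hypothetical candidate with a single substitution at the last residue.
candidate = "LWTSLLALTIPA"

identity = percent_identity(TAIL, candidate)
meets_90_percent = identity >= 90.0
print(f"{identity:.1f}% identical; meets 90% threshold: {meets_90_percent}")
```

With a 12-residue tail, a single substitution leaves 11 of 12 positions identical, so only the 95% tier is missed in this hypothetical case.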

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 

secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMPHOSLIP_PEA_2_P34 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 494 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P34 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 494 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
16                                       H -> R                      Yes
18                                       E -> V                      Yes
81                                       D -> H                      Yes
124                                      S -> Y                      Yes
160                                      T ->                        No
160                                      T -> N                      No
211                                      L ->                        No



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P34, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 495 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 495 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
94                                         yes                           94
143                                        yes                           143
64                                         yes                           64
245                                        no
398                                        no
117                                        yes                           117



Variant protein HUMPHOSLIP_PEA_2_P34 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T16, for which the sequence(s) is/are given at the end of the
application. The coding portion of transcript HUMPHOSLIP_PEA_2_T16 is shown in bold; this
coding portion starts at position 276 and ends at position 926. The transcript also has the
following SNPs as listed in Table 496 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P34
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 496 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
174                                   G -> T                     No
175                                   A -> T                     No
322                                   A -> G                     Yes
328                                   A -> T                     Yes
431                                   G -> A                     Yes
516                                   G -> C                     Yes
644                                   G -> A                     Yes
646                                   C -> A                     Yes
754                                   C ->                       No
754                                   C -> A                     No
836                                   C -> T                     Yes
891                                   C -> T                     Yes
906                                   T ->                       No
972                                   G ->                       No
972                                   G -> C                     No
1001                                  A ->                       No
1031                                  C ->                       No
1154                                  C ->                       No
1315                                  C -> T                     Yes
1375                                  G ->                       No
1406                                  C -> T                     Yes
1427                                  C ->                       No
1497                                  T -> A                     No
1508                                  C -> T                     Yes
1543                                  G -> C                     Yes
1600                                  T -> A                     Yes
1728                                  C -> A                     Yes
1747                                  C -> T                     No
1827                                  C -> T                     Yes
1843                                  A -> G                     No
1843                                  A -> T                     No
1908                                  A -> G                     Yes
1932                                  T -> A                     No
2009                                  G -> T                     Yes
2142                                  G -> A                     Yes
2171                                  G -> T                     Yes
2351                                  G -> A                     Yes
2547                                  C -> T                     Yes
2603                                  G -> T                     Yes
2743                                  C -> T                     Yes
2844                                  C -> T                     Yes
3159                                  A -> G                     No
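
The coding portion quoted above for transcript HUMPHOSLIP_PEA_2_T16 (positions 276 to 926) can be converted into a codon count as a quick consistency check. The sketch below assumes that positions are 1-based and inclusive and that the quoted span does not include a stop codon; neither convention is stated explicitly in the text, and the helper name `codon_count` is hypothetical.

```python
def codon_count(cds_start: int, cds_end: int) -> int:
    """Number of whole codons in a coding span given as 1-based inclusive positions."""
    length = cds_end - cds_start + 1
    if length <= 0:
        raise ValueError("coding span must end at or after its start")
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# Coding portion of HUMPHOSLIP_PEA_2_T16 as stated in the text.
t16_codons = codon_count(276, 926)
print(t16_codons)
```

Under these assumptions the span is 651 nucleotides, that is 217 codons, which agrees with the 217 amino acids described for HUMPHOSLIP_PEA_2_P34 (amino acids 1 - 205 followed by the tail at 206 - 217).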



Variant protein HUMPHOSLIP_PEA_2_P35 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMPHOSLIP_PEA_2_T18. An alignment is given to the known protein (Phospholipid
transfer protein precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between HUMPHOSLIP_PEA_2_P35 and PLTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMPHOSLIP_PEA_2_P35,
comprising a first amino acid sequence being at least 90 % homologous to
MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGH
FYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLRFRRQLLYWF corresponding to
amino acids 1 - 109 of PLTP_HUMAN, which also corresponds to amino acids 1 - 109 of
HUMPHOSLIP_PEA_2_P35, a second amino acid sequence bridging amino acid sequence
comprising of L, a third amino acid sequence being at least 90 % homologous to
KVYDFLSTFITSGMRFLLNQQ corresponding to amino acids 163 - 183 of PLTP_HUMAN,
which also corresponds to amino acids 111 - 131 of HUMPHOSLIP_PEA_2_P35, and a fourth
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence VWAATGRRVARVGMLSL corresponding to amino acids 132 - 148 of
HUMPHOSLIP_PEA_2_P35, wherein said first amino acid sequence, second amino acid
sequence, third amino acid sequence and fourth amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for an edge portion of HUMPHOSLIP_PEA_2_P35,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise FLK having a
structure as follows (numbering according to HUMPHOSLIP_PEA_2_P35): a sequence starting
from any of amino acid numbers 109-x to 109; and ending at any of amino acid numbers 111 +
((n-2) - x), in which x varies from 0 to n-2.

3. An isolated polypeptide encoding for a tail of HUMPHOSLIP_PEA_2_P35, comprising
a polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VWAATGRRVARVGMLSL in HUMPHOSLIP_PEA_2_P35.
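
The edge-portion definition in item 2 above describes a family of windows around amino acids 109 to 111. The sketch below enumerates those windows by taking the stated formula literally (start at 109 - x, end at 111 + ((n - 2) - x), for x from 0 to n - 2). The function name and the default arguments are hypothetical, and no check against the 148-residue length of the variant is attempted.

```python
def edge_portion_windows(n: int, left: int = 109, right: int = 111):
    """Enumerate (start, end) amino acid positions for an edge portion of parameter n.

    Follows the formula in the text: start = left - x, end = right + ((n - 2) - x),
    with x running from 0 to n - 2 inclusive.
    """
    if n < 2:
        raise ValueError("n must be at least 2 for x to range over 0..n-2")
    windows = []
    for x in range(0, n - 1):  # x = 0, 1, ..., n-2
        start = left - x
        end = right + ((n - 2) - x)
        windows.append((start, end))
    return windows


# Smallest length named in the text ("at least about 10 amino acids").
for start, end in edge_portion_windows(10):
    print(start, end)
```

Every window produced this way contains positions 109 through 111, so the junction residues are always covered regardless of the value of x.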

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMPHOSLIP_PEA_2_P35 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 497 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMPHOSLIP_PEA_2_P35 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 497 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
16                                       H -> R                      Yes
18                                       E -> V                      Yes
81                                       D -> H                      Yes



The glycosylation sites of variant protein HUMPHOSLIP_PEA_2_P35, as compared to
the known protein Phospholipid transfer protein precursor, are described in Table 498 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 498 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
94                                         yes                           94
143                                        no
64                                         yes                           64
245                                        no
398                                        no
117                                        no





Variant protein HUMPHOSLIP_PEA_2_P35 is encoded by the following transcript(s):
HUMPHOSLIP_PEA_2_T18, for which the sequence(s) is/are given at the end of the
application. The coding portion of transcript HUMPHOSLIP_PEA_2_T18 is shown in bold; this
coding portion starts at position 276 and ends at position 719. The transcript also has the
following SNPs as listed in Table 499 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein HUMPHOSLIP_PEA_2_P35
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 499 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
174                                   G -> T                     No
175                                   A -> T                     No
322                                   A -> G                     Yes
328                                   A -> T                     Yes
431                                   G -> A                     Yes
516                                   G -> C                     Yes
765                                   C -> T                     Yes
827                                   C -> T                     Yes
842                                   T ->                       No
908                                   G ->                       No
908                                   G -> C                     No
937                                   A ->                       No
967                                   C ->                       No
1090                                  C ->                       No
1251                                  C -> T                     Yes
1311                                  G ->                       No
1342                                  C -> T                     Yes
1363                                  C ->                       No
1433                                  T -> A                     No
1444                                  C -> T                     Yes
1479                                  G -> C                     Yes
1536                                  T -> A                     Yes
1664                                  C -> A                     Yes
1683                                  C -> T                     No
1763                                  C -> T                     Yes
1779                                  A -> G                     No
1779                                  A -> T                     No
1844                                  A -> G                     Yes
1868                                  T -> A                     No
1945                                  G -> T                     Yes
2078                                  G -> A                     Yes
2107                                  G -> T                     Yes
2287                                  G -> A                     Yes
2483                                  C -> T                     Yes
2539                                  G -> T                     Yes
2679                                  C -> T                     Yes
2780                                  C -> T                     Yes
3095                                  A -> G                     No
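
The nucleotide SNP positions above can be related to amino acid positions through the stated coding portion of HUMPHOSLIP_PEA_2_T18 (positions 276 to 719). The sketch below assumes 1-based positions and that codon 1 begins at the first coding nucleotide; the helper name `codon_index` is hypothetical, and the calculation says nothing about whether a given change is silent.

```python
def codon_index(nucleotide_pos: int, cds_start: int, cds_end: int):
    """Return the 1-based codon number containing a nucleotide, or None outside the CDS."""
    if not cds_start <= nucleotide_pos <= cds_end:
        return None
    return (nucleotide_pos - cds_start) // 3 + 1


# Coding portion of HUMPHOSLIP_PEA_2_T18 as stated in the text.
CDS_START, CDS_END = 276, 719

# A few SNP positions taken from the nucleic acid SNP table for this transcript.
snp_positions = [174, 322, 328, 431, 516, 765]
codons = {pos: codon_index(pos, CDS_START, CDS_END) for pos in snp_positions}
for pos, codon in codons.items():
    print(pos, codon)
```

Under these assumptions, nucleotide positions 322, 328 and 516 fall in codons 16, 18 and 81, which are the amino acid positions listed for HUMPHOSLIP_PEA_2_P35 in Table 497, while positions 174 and 765 lie outside the coding portion.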



As noted above, cluster HUMPHOSLIP features 53 segment(s), which were listed in
Table 2 above and for which the sequence(s) are given at the end of the application. These
segment(s) are portions of nucleic acid sequence(s) which are described herein separately
because they are of particular interest. A description of each segment according to the present
invention is now provided.

Segment cluster HUMPHOSLIP_PEA_2_node_0 according to the present invention is
supported by 150 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 500 below describes the starting and ending position of this segment on each transcript.

Table 500 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6                                 264
HUMPHOSLIP_PEA_2_T7                                 264
HUMPHOSLIP_PEA_2_T14                                264
HUMPHOSLIP_PEA_2_T16                                264
HUMPHOSLIP_PEA_2_T17                                264
HUMPHOSLIP_PEA_2_T18                                264
HUMPHOSLIP_PEA_2_T19                                264



Segment cluster HUMPHOSLIP_PEA_2_node_19 according to the present invention is
supported by 186 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16 and
HUMPHOSLIP_PEA_2_T19. Table 501 below describes the starting and ending position of this
segment on each transcript.

Table 501 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     559                         714
HUMPHOSLIP_PEA_2_T7     697                         852
HUMPHOSLIP_PEA_2_T14    605                         760
HUMPHOSLIP_PEA_2_T16    605                         760
HUMPHOSLIP_PEA_2_T19    605                         760
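
Because the same segment appears in several transcripts, the positions for segment HUMPHOSLIP_PEA_2_node_19 should describe a span of the same length on each of them. The sketch below checks this for the values given in the text, assuming 1-based inclusive positions; the dictionary and function names are hypothetical.

```python
# Start and end positions of HUMPHOSLIP_PEA_2_node_19 on each transcript,
# copied from the segment location table in the text.
NODE_19_LOCATIONS = {
    "HUMPHOSLIP_PEA_2_T6": (559, 714),
    "HUMPHOSLIP_PEA_2_T7": (697, 852),
    "HUMPHOSLIP_PEA_2_T14": (605, 760),
    "HUMPHOSLIP_PEA_2_T16": (605, 760),
    "HUMPHOSLIP_PEA_2_T19": (605, 760),
}


def segment_lengths(locations):
    """Length of the segment on each transcript, treating positions as inclusive."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


lengths = segment_lengths(NODE_19_LOCATIONS)
consistent = len(set(lengths.values())) == 1
print(lengths, consistent)
```

A check of this kind is also a convenient way to spot transcription errors in position tables, since a mistyped digit usually changes the computed length.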

Segment cluster HUMPHOSLIP_PEA_2_node_34 according to the present invention is
supported by 191 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 502 below describes the starting and ending position of this segment on each transcript.

Table 502 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     971                         1111
HUMPHOSLIP_PEA_2_T7     1109                        1249
HUMPHOSLIP_PEA_2_T14    1102                        1242
HUMPHOSLIP_PEA_2_T16    1010                        1150
HUMPHOSLIP_PEA_2_T17    732                         872
HUMPHOSLIP_PEA_2_T18    946                         1086
HUMPHOSLIP_PEA_2_T19    1017                        1157



Segment cluster HUMPHOSLIP_PEA_2_node_68 according to the present invention is
supported by 131 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 503 below describes the starting and ending position of this segment on each transcript.

Table 503 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1867                        2285
HUMPHOSLIP_PEA_2_T7     2005                        2423
HUMPHOSLIP_PEA_2_T14    1998                        2416
HUMPHOSLIP_PEA_2_T16    1906                        2324
HUMPHOSLIP_PEA_2_T17    1628                        2046
HUMPHOSLIP_PEA_2_T18    1842                        2260
HUMPHOSLIP_PEA_2_T19    1996                        2414









Segment cluster HUMPHOSLIP_PEA_2_node_70 according to the present invention is
supported by 5 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 504 below describes the starting and ending position of this segment on each transcript.

Table 504 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2298                        2529
HUMPHOSLIP_PEA_2_T7     2436                        2667
HUMPHOSLIP_PEA_2_T14    2429                        2660
HUMPHOSLIP_PEA_2_T16    2337                        2568
HUMPHOSLIP_PEA_2_T17    2059                        2290
HUMPHOSLIP_PEA_2_T18    2273                        2504
HUMPHOSLIP_PEA_2_T19    2427                        2658



Segment cluster HUMPHOSLIP_PEA_2_node_75 according to the present invention is
supported by 14 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 505 below describes the starting and ending position of this segment on each transcript.

Table 505 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2846                        3125
HUMPHOSLIP_PEA_2_T7     2984                        3263
HUMPHOSLIP_PEA_2_T14    2977                        3256
HUMPHOSLIP_PEA_2_T16    2885                        3164
HUMPHOSLIP_PEA_2_T17    2607                        2886
HUMPHOSLIP_PEA_2_T18    2821                        3100
HUMPHOSLIP_PEA_2_T19    2975                        3254



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster HUMPHOSLIP_PEA_2_node_2 according to the present invention is
supported by 159 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 506 below describes the starting and ending position of this segment on each transcript.

Table 506 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     265                         337
HUMPHOSLIP_PEA_2_T7     265                         337
HUMPHOSLIP_PEA_2_T14    265                         337
HUMPHOSLIP_PEA_2_T16    265                         337
HUMPHOSLIP_PEA_2_T17    265                         337
HUMPHOSLIP_PEA_2_T18    265                         337
HUMPHOSLIP_PEA_2_T19    265                         337



Segment cluster HUMPHOSLIP_PEA_2_node_3 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T7,
HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, HUMPHOSLIP_PEA_2_T17,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 507 below describes the
starting and ending position of this segment on each transcript.

Table 507 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T7     338                         355
HUMPHOSLIP_PEA_2_T14    338                         355
HUMPHOSLIP_PEA_2_T16    338                         355
HUMPHOSLIP_PEA_2_T17    338                         355
HUMPHOSLIP_PEA_2_T18    338                         355
HUMPHOSLIP_PEA_2_T19    338                         355



Segment cluster HUMPHOSLIP_PEA_2_node_4 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T7,
HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, HUMPHOSLIP_PEA_2_T17,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 508 below describes the
starting and ending position of this segment on each transcript.

Table 508 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T7     356                         375
HUMPHOSLIP_PEA_2_T14    356                         375
HUMPHOSLIP_PEA_2_T16    356                         375
HUMPHOSLIP_PEA_2_T17    356                         375
HUMPHOSLIP_PEA_2_T18    356                         375
HUMPHOSLIP_PEA_2_T19    356                         375



Segment cluster HUMPHOSLIP_PEA_2_node_6 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T7,
HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, HUMPHOSLIP_PEA_2_T17,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 509 below describes the
starting and ending position of this segment on each transcript.

Table 509 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T7     376                         383
HUMPHOSLIP_PEA_2_T14    376                         383
HUMPHOSLIP_PEA_2_T16    376                         383
HUMPHOSLIP_PEA_2_T17    376                         383
HUMPHOSLIP_PEA_2_T18    376                         383
HUMPHOSLIP_PEA_2_T19    376                         383



Segment cluster HUMPHOSLIP_PEA_2_node_7 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 510 below describes the starting and ending position of this segment on each transcript.

Table 510 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     338                         343
HUMPHOSLIP_PEA_2_T7     384                         389
HUMPHOSLIP_PEA_2_T14    384                         389
HUMPHOSLIP_PEA_2_T16    384                         389
HUMPHOSLIP_PEA_2_T17    384                         389
HUMPHOSLIP_PEA_2_T18    384                         389
HUMPHOSLIP_PEA_2_T19    384                         389



Segment cluster HUMPHOSLIP_PEA_2_node_8 according to the present invention is
supported by 171 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 511 below describes the starting and ending position of this segment on each transcript.

Table 511 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     344                         378
HUMPHOSLIP_PEA_2_T7     390                         424
HUMPHOSLIP_PEA_2_T14    390                         424
HUMPHOSLIP_PEA_2_T16    390                         424
HUMPHOSLIP_PEA_2_T17    390                         424
HUMPHOSLIP_PEA_2_T18    390                         424
HUMPHOSLIP_PEA_2_T19    390                         424

Segment cluster HUMPHOSLIP_PEA_2_node_9 according to the present invention is
supported by 168 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 512 below describes the starting and ending position of this segment on each transcript.

Table 512 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     379                         429
HUMPHOSLIP_PEA_2_T7     425                         475
HUMPHOSLIP_PEA_2_T14    425                         475
HUMPHOSLIP_PEA_2_T16    425                         475
HUMPHOSLIP_PEA_2_T17    425                         475
HUMPHOSLIP_PEA_2_T18    425                         475
HUMPHOSLIP_PEA_2_T19    425                         475



Segment cluster HUMPHOSLIP_PEA_2_node_14 according to the present invention is
supported by 6 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T7. Table 513
below describes the starting and ending position of this segment on each transcript.

Table 513 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T7     476                         567



Segment cluster HUMPHOSLIP_PEA_2_node_15 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 514 below describes the
starting and ending position of this segment on each transcript.

Table 514 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     430                         445
HUMPHOSLIP_PEA_2_T7     568                         583
HUMPHOSLIP_PEA_2_T14    476                         491
HUMPHOSLIP_PEA_2_T16    476                         491
HUMPHOSLIP_PEA_2_T18    476                         491
HUMPHOSLIP_PEA_2_T19    476                         491



Segment cluster HUMPHOSLIP_PEA_2_node_16 according to the present invention is
supported by 179 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 515 below describes the
starting and ending position of this segment on each transcript.

Table 515 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     446                         534
HUMPHOSLIP_PEA_2_T7     584                         672
HUMPHOSLIP_PEA_2_T14    492                         580
HUMPHOSLIP_PEA_2_T16    492                         580
HUMPHOSLIP_PEA_2_T18    492                         580
HUMPHOSLIP_PEA_2_T19    492                         580

Segment cluster HUMPHOSLIP_PEA_2_node_17 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 516 below describes the
starting and ending position of this segment on each transcript.

Table 516 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     535                         558
HUMPHOSLIP_PEA_2_T7     673                         696
HUMPHOSLIP_PEA_2_T14    581                         604
HUMPHOSLIP_PEA_2_T16    581                         604
HUMPHOSLIP_PEA_2_T18    581                         604
HUMPHOSLIP_PEA_2_T19    581                         604



Segment cluster HUMPHOSLIP_PEA_2_node_23 according to the present invention is
supported by 168 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 517 below describes the starting and ending position of this segment on each transcript.

Table 517 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     715                         766
HUMPHOSLIP_PEA_2_T7     853                         904
HUMPHOSLIP_PEA_2_T14    761                         812
HUMPHOSLIP_PEA_2_T16    761                         812
HUMPHOSLIP_PEA_2_T17    476                         527
HUMPHOSLIP_PEA_2_T18    605                         656
HUMPHOSLIP_PEA_2_T19    761                         812



Segment cluster HUMPHOSLIP_PEA_2_node_24 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 518 below describes the starting and ending position of this segment on each transcript.

Table 518 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     767                         778
HUMPHOSLIP_PEA_2_T7     905                         916
HUMPHOSLIP_PEA_2_T14    813                         824
HUMPHOSLIP_PEA_2_T16    813                         824
HUMPHOSLIP_PEA_2_T17    528                         539
HUMPHOSLIP_PEA_2_T18    657                         668
HUMPHOSLIP_PEA_2_T19    813                         824

Segment cluster HUMPHOSLIP_PEA_2_node_25 according to the present invention is
supported by 5 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T14 and
HUMPHOSLIP_PEA_2_T18. Table 519 below describes the starting and ending position of this
segment on each transcript.

Table 519 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T14    825                         909
HUMPHOSLIP_PEA_2_T18    669                         753



Segment cluster HUMPHOSLIP_PEA_2_node_26 according to the present invention is
supported by 163 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 520 below describes the starting and ending position of this segment on each transcript.

Table 520 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     779                         842
HUMPHOSLIP_PEA_2_T7     917                         980
HUMPHOSLIP_PEA_2_T14    910                         973
HUMPHOSLIP_PEA_2_T16    825                         888
HUMPHOSLIP_PEA_2_T17    540                         603
HUMPHOSLIP_PEA_2_T18    754                         817
HUMPHOSLIP_PEA_2_T19    825                         888



Segment cluster HUMPHOSLIP_PEA_2_node_29 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T17,
HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. Table 521 below describes the
starting and ending position of this segment on each transcript.

Table 521 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     843                         849
HUMPHOSLIP_PEA_2_T7     981                         987
HUMPHOSLIP_PEA_2_T14    974                         980
HUMPHOSLIP_PEA_2_T17    604                         610
HUMPHOSLIP_PEA_2_T18    818                         824
HUMPHOSLIP_PEA_2_T19    889                         895



Segment cluster HUMPHOSLIP_PEA_2_node_30 according to the present invention is 
supported by 181 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2__T6, 
HUMPHOSLIPJ>EA_2JT7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIPJPEA_2JT16, 
HUMPHOSLIP__PEA__2_T17 ? HUMPHOSLIPJ>EA_2_T18 and HUMPHOSLIP__PEA_2_T19. 
Table 522 below describes the starting and ending position of this segment on each transcript. 



Table 522 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     850                         934
HUMPHOSLIP_PEA_2_T7     988                         1072
HUMPHOSLIP_PEA_2_T14    981                         1065
HUMPHOSLIP_PEA_2_T16    889                         973
HUMPHOSLIP_PEA_2_T17    611                         695
HUMPHOSLIP_PEA_2_T18    825                         909
HUMPHOSLIP_PEA_2_T19    896                         980
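
As a quick consistency check on Tables 521 and 522, the following minimal Python sketch tests whether, on every transcript that contains both segments, node_30 starts at the position immediately after node_29 ends. The positions are copied from the two tables; the dictionary names and the helper `is_contiguous` are illustrative assumptions and are not part of the original disclosure.

```python
# Start/end positions copied from Tables 521 and 522 (1-based, inclusive).
# All names in this sketch are hypothetical and chosen for illustration.
NODE_29 = {
    "HUMPHOSLIP_PEA_2_T6": (843, 849),
    "HUMPHOSLIP_PEA_2_T7": (981, 987),
    "HUMPHOSLIP_PEA_2_T14": (974, 980),
    "HUMPHOSLIP_PEA_2_T17": (604, 610),
    "HUMPHOSLIP_PEA_2_T18": (818, 824),
    "HUMPHOSLIP_PEA_2_T19": (889, 895),
}
NODE_30 = {
    "HUMPHOSLIP_PEA_2_T6": (850, 934),
    "HUMPHOSLIP_PEA_2_T7": (988, 1072),
    "HUMPHOSLIP_PEA_2_T14": (981, 1065),
    "HUMPHOSLIP_PEA_2_T16": (889, 973),
    "HUMPHOSLIP_PEA_2_T17": (611, 695),
    "HUMPHOSLIP_PEA_2_T18": (825, 909),
    "HUMPHOSLIP_PEA_2_T19": (896, 980),
}


def is_contiguous(upstream, downstream):
    """For transcripts present in both tables, report whether the
    downstream segment starts one position after the upstream one ends."""
    shared = upstream.keys() & downstream.keys()
    return {t: downstream[t][0] == upstream[t][1] + 1 for t in sorted(shared)}


result = is_contiguous(NODE_29, NODE_30)
# Transcripts listed for node_30 but not for node_29.
missing = sorted(NODE_30.keys() - NODE_29.keys())
print(result)
print(missing)
```

In this sketch `missing` collects the transcripts that carry node_30 but are not listed for node_29, which is a convenient way to spot transcripts that skip a segment.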



Segment cluster HUMPHOSLIP_PEA_2_node_33 according to the present invention is
supported by 173 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 523 below describes the starting and ending position of this segment on each transcript. 

Table 523 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     935                         970
HUMPHOSLIP_PEA_2_T7     1073                        1108
HUMPHOSLIP_PEA_2_T14    1066                        1101
HUMPHOSLIP_PEA_2_T16    974                         1009
HUMPHOSLIP_PEA_2_T17    696                         731
HUMPHOSLIP_PEA_2_T18    910                         945
HUMPHOSLIP_PEA_2_T19    981                         1016
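
Because the same segment is reported on several transcripts, its length (ending position minus starting position plus one) should be identical in every row. The short Python sketch below applies that check to the values of Table 523. The names `TABLE_523` and `segment_lengths` are hypothetical, and the inclusive, 1-based reading of the positions is an assumption.

```python
# Start/end positions copied from Table 523 (assumed 1-based, inclusive).
# Variable and function names are illustrative only.
TABLE_523 = {
    "HUMPHOSLIP_PEA_2_T6": (935, 970),
    "HUMPHOSLIP_PEA_2_T7": (1073, 1108),
    "HUMPHOSLIP_PEA_2_T14": (1066, 1101),
    "HUMPHOSLIP_PEA_2_T16": (974, 1009),
    "HUMPHOSLIP_PEA_2_T17": (696, 731),
    "HUMPHOSLIP_PEA_2_T18": (910, 945),
    "HUMPHOSLIP_PEA_2_T19": (981, 1016),
}


def segment_lengths(table):
    """Map each transcript name to the inclusive length of the segment on it."""
    return {name: end - start + 1 for name, (start, end) in table.items()}


lengths = segment_lengths(TABLE_523)
# A single distinct value means every row agrees on the segment length.
distinct_lengths = set(lengths.values())
print(distinct_lengths)
```

A row whose length differs from the others would point to a transcription or scanning error in that row rather than to a biological difference.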



Segment cluster HUMPHOSLIP_PEA_2_node_36 according to the present invention is 
supported by 163 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 524 below describes the starting and ending position of this segment on each transcript. 



Table 524 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1112                        1156
HUMPHOSLIP_PEA_2_T7     1250                        1294
HUMPHOSLIP_PEA_2_T14    1243                        1287
HUMPHOSLIP_PEA_2_T16    1151                        1195
HUMPHOSLIP_PEA_2_T17    873                         917
HUMPHOSLIP_PEA_2_T18    1087                        1131
HUMPHOSLIP_PEA_2_T19    1158                        1202



Segment cluster HUMPHOSLIP_PEA_2_node_37 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 525 below describes the starting and ending position of this segment on each transcript. 



Table 525 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1157                        1171
HUMPHOSLIP_PEA_2_T7     1295                        1309
HUMPHOSLIP_PEA_2_T14    1288                        1302
HUMPHOSLIP_PEA_2_T16    1196                        1210
HUMPHOSLIP_PEA_2_T17    918                         932
HUMPHOSLIP_PEA_2_T18    1132                        1146
HUMPHOSLIP_PEA_2_T19    1203                        1217



Segment cluster HUMPHOSLIP_PEA_2_node_39 according to the present invention is
supported by 166 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 525 below describes the starting and ending position of this segment on each transcript. 

Table 525 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1172                        1201
HUMPHOSLIP_PEA_2_T7     1310                        1339
HUMPHOSLIP_PEA_2_T14    1303                        1332
HUMPHOSLIP_PEA_2_T16    1211                        1240
HUMPHOSLIP_PEA_2_T17    933                         962
HUMPHOSLIP_PEA_2_T18    1147                        1176
HUMPHOSLIP_PEA_2_T19    1218                        1247



Segment cluster HUMPHOSLIP_PEA_2_node_40 according to the present invention is
supported by 199 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 526 below describes the starting and ending position of this segment on each transcript. 

Table 526 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1202                        1288
HUMPHOSLIP_PEA_2_T7     1340                        1426
HUMPHOSLIP_PEA_2_T14    1333                        1419
HUMPHOSLIP_PEA_2_T16    1241                        1327
HUMPHOSLIP_PEA_2_T17    963                         1049
HUMPHOSLIP_PEA_2_T18    1177                        1263
HUMPHOSLIP_PEA_2_T19    1248                        1334

Segment cluster HUMPHOSLIP_PEA_2_node_41 according to the present invention is
supported by 186 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 527 below describes the starting and ending position of this segment on each transcript. 



Table 527 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1289                        1318
HUMPHOSLIP_PEA_2_T7     1427                        1456
HUMPHOSLIP_PEA_2_T14    1420                        1449
HUMPHOSLIP_PEA_2_T16    1328                        1357
HUMPHOSLIP_PEA_2_T17    1050                        1079
HUMPHOSLIP_PEA_2_T18    1264                        1293
HUMPHOSLIP_PEA_2_T19    1335                        1364



Segment cluster HUMPHOSLIP_PEA_2_node_42 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 528 below describes the starting and ending position of this segment on each transcript. 

Table 528 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1319                        1336
HUMPHOSLIP_PEA_2_T7     1457                        1474
HUMPHOSLIP_PEA_2_T14    1450                        1467
HUMPHOSLIP_PEA_2_T16    1358                        1375
HUMPHOSLIP_PEA_2_T17    1080                        1097
HUMPHOSLIP_PEA_2_T18    1294                        1311
HUMPHOSLIP_PEA_2_T19    1365                        1382



Segment cluster HUMPHOSLIP_PEA_2_node_44 according to the present invention is 
supported by 185 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 529 below describes the starting and ending position of this segment on each transcript. 

Table 529 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1337                        1363
HUMPHOSLIP_PEA_2_T7     1475                        1501
HUMPHOSLIP_PEA_2_T14    1468                        1494
HUMPHOSLIP_PEA_2_T16    1376                        1402
HUMPHOSLIP_PEA_2_T17    1098                        1124
HUMPHOSLIP_PEA_2_T18    1312                        1338
HUMPHOSLIP_PEA_2_T19    1383                        1409



Segment cluster HUMPHOSLIP_PEA_2_node_45 according to the present invention is
supported by 197 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 530 below describes the starting and ending position of this segment on each transcript.

Table 530 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1364                        1404
HUMPHOSLIP_PEA_2_T7     1502                        1542
HUMPHOSLIP_PEA_2_T14    1495                        1535
HUMPHOSLIP_PEA_2_T16    1403                        1443
HUMPHOSLIP_PEA_2_T17    1125                        1165
HUMPHOSLIP_PEA_2_T18    1339                        1379
HUMPHOSLIP_PEA_2_T19    1410                        1450



Segment cluster HUMPHOSLIP_PEA_2_node_47 according to the present invention is
supported by 223 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 531 below describes the starting and ending position of this segment on each transcript. 

Table 531 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1405                        1447
HUMPHOSLIP_PEA_2_T7     1543                        1585
HUMPHOSLIP_PEA_2_T14    1536                        1578
HUMPHOSLIP_PEA_2_T16    1444                        1486
HUMPHOSLIP_PEA_2_T17    1166                        1208
HUMPHOSLIP_PEA_2_T18    1380                        1422
HUMPHOSLIP_PEA_2_T19    1451                        1493

Segment cluster HUMPHOSLIP_PEA_2_node_51 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 532 below describes the starting and ending position of this segment on each transcript.



Table 532 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1448                        1462
HUMPHOSLIP_PEA_2_T7     1586                        1600
HUMPHOSLIP_PEA_2_T14    1579                        1593
HUMPHOSLIP_PEA_2_T16    1487                        1501
HUMPHOSLIP_PEA_2_T17    1209                        1223
HUMPHOSLIP_PEA_2_T18    1423                        1437
HUMPHOSLIP_PEA_2_T19    1494                        1508



Segment cluster HUMPHOSLIP_PEA_2_node_52 according to the present invention is
supported by 235 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 533 below describes the starting and ending position of this segment on each transcript. 

Table 533 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1463                        1511
HUMPHOSLIP_PEA_2_T7     1601                        1649
HUMPHOSLIP_PEA_2_T14    1594                        1642
HUMPHOSLIP_PEA_2_T16    1502                        1550
HUMPHOSLIP_PEA_2_T17    1224                        1272
HUMPHOSLIP_PEA_2_T18    1438                        1486
HUMPHOSLIP_PEA_2_T19    1509                        1557



Segment cluster HUMPHOSLIP_PEA_2_node_53 according to the present invention is
supported by 5 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T19. Table 534
below describes the starting and ending position of this segment on each transcript.



Table 534 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T19    1558                        1640



Segment cluster HUMPHOSLIP_PEA_2_node_54 according to the present invention is
supported by 236 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 535 below describes the starting and ending position of this segment on each transcript.

Table 535 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1512                        1552
HUMPHOSLIP_PEA_2_T7     1650                        1690
HUMPHOSLIP_PEA_2_T14    1643                        1683
HUMPHOSLIP_PEA_2_T16    1551                        1591
HUMPHOSLIP_PEA_2_T17    1273                        1313
HUMPHOSLIP_PEA_2_T18    1487                        1527
HUMPHOSLIP_PEA_2_T19    1641                        1681



Segment cluster HUMPHOSLIP_PEA_2_node_55 according to the present invention is
supported by 232 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 536 below describes the starting and ending position of this segment on each transcript. 



Table 536 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1553                        1588
HUMPHOSLIP_PEA_2_T7     1691                        1726
HUMPHOSLIP_PEA_2_T14    1684                        1719
HUMPHOSLIP_PEA_2_T16    1592                        1627
HUMPHOSLIP_PEA_2_T17    1314                        1349
HUMPHOSLIP_PEA_2_T18    1528                        1563
HUMPHOSLIP_PEA_2_T19    1682                        1717



Segment cluster HUMPHOSLIP_PEA_2_node_58 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 537 below describes the starting and ending position of this segment on each transcript. 

Table 537 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1589                        1612
HUMPHOSLIP_PEA_2_T7     1727                        1750
HUMPHOSLIP_PEA_2_T14    1720                        1743
HUMPHOSLIP_PEA_2_T16    1628                        1651
HUMPHOSLIP_PEA_2_T17    1350                        1373
HUMPHOSLIP_PEA_2_T18    1564                        1587
HUMPHOSLIP_PEA_2_T19    1718                        1741



Segment cluster HUMPHOSLIP_PEA_2_node_59 according to the present invention is
supported by 230 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 538 below describes the starting and ending position of this segment on each transcript. 



Table 538 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1613                        1648
HUMPHOSLIP_PEA_2_T7     1751                        1786
HUMPHOSLIP_PEA_2_T14    1744                        1779
HUMPHOSLIP_PEA_2_T16    1652                        1687
HUMPHOSLIP_PEA_2_T17    1374                        1409
HUMPHOSLIP_PEA_2_T18    1588                        1623
HUMPHOSLIP_PEA_2_T19    1742                        1777

Segment cluster HUMPHOSLIP_PEA_2_node_60 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 539 below describes the starting and ending position of this segment on each transcript. 



Table 539 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1649                        1671
HUMPHOSLIP_PEA_2_T7     1787                        1809
HUMPHOSLIP_PEA_2_T14    1780                        1802
HUMPHOSLIP_PEA_2_T16    1688                        1710
HUMPHOSLIP_PEA_2_T17    1410                        1432
HUMPHOSLIP_PEA_2_T18    1624                        1646
HUMPHOSLIP_PEA_2_T19    1778                        1800



Segment cluster HUMPHOSLIP_PEA_2_node_61 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, 
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 540 below describes the starting and ending position of this segment on each transcript. 



Table 540 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1672                        1680
HUMPHOSLIP_PEA_2_T7     1810                        1818
HUMPHOSLIP_PEA_2_T14    1803                        1811
HUMPHOSLIP_PEA_2_T16    1711                        1719
HUMPHOSLIP_PEA_2_T17    1433                        1441
HUMPHOSLIP_PEA_2_T18    1647                        1655
HUMPHOSLIP_PEA_2_T19    1801                        1809



Segment cluster HUMPHOSLIP_PEA_2_node_62 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 541 below describes the starting and ending position of this segment on each transcript. 



Table 541 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1681                        1703
HUMPHOSLIP_PEA_2_T7     1819                        1841
HUMPHOSLIP_PEA_2_T14    1812                        1834
HUMPHOSLIP_PEA_2_T16    1720                        1742
HUMPHOSLIP_PEA_2_T17    1442                        1464
HUMPHOSLIP_PEA_2_T18    1656                        1678
HUMPHOSLIP_PEA_2_T19    1810                        1832



Segment cluster HUMPHOSLIP_PEA_2_node_63 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 

HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, 
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 542 below describes the starting and ending position of this segment on each transcript.

Table 542 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1704                        1727
HUMPHOSLIP_PEA_2_T7     1842                        1865
HUMPHOSLIP_PEA_2_T14    1835                        1858
HUMPHOSLIP_PEA_2_T16    1743                        1766
HUMPHOSLIP_PEA_2_T17    1465                        1488
HUMPHOSLIP_PEA_2_T18    1679                        1702
HUMPHOSLIP_PEA_2_T19    1833                        1856



Segment cluster HUMPHOSLIP_PEA_2_node_64 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 

HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, 
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 543 below describes the starting and ending position of this segment on each transcript. 

Table 543 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1728                        1734
HUMPHOSLIP_PEA_2_T7     1866                        1872
HUMPHOSLIP_PEA_2_T14    1859                        1865
HUMPHOSLIP_PEA_2_T16    1767                        1773
HUMPHOSLIP_PEA_2_T17    1489                        1495
HUMPHOSLIP_PEA_2_T18    1703                        1709
HUMPHOSLIP_PEA_2_T19    1857                        1863



Segment cluster HUMPHOSLIP_PEA_2_node_65 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 544 below describes the starting and ending position of this segment on each transcript. 



Table 544 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1735                        1754
HUMPHOSLIP_PEA_2_T7     1873                        1892
HUMPHOSLIP_PEA_2_T14    1866                        1885
HUMPHOSLIP_PEA_2_T16    1774                        1793
HUMPHOSLIP_PEA_2_T17    1496                        1515
HUMPHOSLIP_PEA_2_T18    1710                        1729
HUMPHOSLIP_PEA_2_T19    1864                        1883



Segment cluster HUMPHOSLIP_PEA_2_node_66 according to the present invention is 
supported by 180 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 545 below describes the starting and ending position of this segment on each transcript. 



Table 545 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1755                        1844
HUMPHOSLIP_PEA_2_T7     1893                        1982
HUMPHOSLIP_PEA_2_T14    1886                        1975
HUMPHOSLIP_PEA_2_T16    1794                        1883
HUMPHOSLIP_PEA_2_T17    1516                        1605
HUMPHOSLIP_PEA_2_T18    1730                        1819
HUMPHOSLIP_PEA_2_T19    1884                        1973



Segment cluster HUMPHOSLIP_PEA_2_node_67 according to the present invention can
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 546 below describes the starting and ending position of this segment on each transcript. 



Table 546 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     1845                        1866
HUMPHOSLIP_PEA_2_T7     1983                        2004
HUMPHOSLIP_PEA_2_T14    1976                        1997
HUMPHOSLIP_PEA_2_T16    1884                        1905
HUMPHOSLIP_PEA_2_T17    1606                        1627
HUMPHOSLIP_PEA_2_T18    1820                        1841
HUMPHOSLIP_PEA_2_T19    1974                        1995



Segment cluster HUMPHOSLIP_PEA_2_node_69 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, 
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19. 
Table 547 below describes the starting and ending position of this segment on each transcript.



Table 547 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2286                        2297
HUMPHOSLIP_PEA_2_T7     2424                        2435
HUMPHOSLIP_PEA_2_T14    2417                        2428
HUMPHOSLIP_PEA_2_T16    2325                        2336
HUMPHOSLIP_PEA_2_T17    2047                        2058
HUMPHOSLIP_PEA_2_T18    2261                        2272
HUMPHOSLIP_PEA_2_T19    2415                        2426



Segment cluster HUMPHOSLIP_PEA_2_node_71 according to the present invention can 
be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 

HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16, 
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 548 below describes the starting and ending position of this segment on each transcript. 



Table 548 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2530                        2542
HUMPHOSLIP_PEA_2_T7     2668                        2680
HUMPHOSLIP_PEA_2_T14    2661                        2673
HUMPHOSLIP_PEA_2_T16    2569                        2581
HUMPHOSLIP_PEA_2_T17    2291                        2303
HUMPHOSLIP_PEA_2_T18    2505                        2517
HUMPHOSLIP_PEA_2_T19    2659                        2671



Segment cluster HUMPHOSLIP_PEA_2_node_72 according to the present invention is 
supported by 7 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6, 
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 549 below describes the starting and ending position of this segment on each transcript. 

Table 549 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2543                        2647
HUMPHOSLIP_PEA_2_T7     2681                        2785
HUMPHOSLIP_PEA_2_T14    2674                        2778
HUMPHOSLIP_PEA_2_T16    2582                        2686
HUMPHOSLIP_PEA_2_T17    2304                        2408
HUMPHOSLIP_PEA_2_T18    2518                        2622
HUMPHOSLIP_PEA_2_T19    2672                        2776



Segment cluster HUMPHOSLIP_PEA_2_node_73 according to the present invention is
supported by 5 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 550 below describes the starting and ending position of this segment on each transcript.

Table 550 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2648                        2755
HUMPHOSLIP_PEA_2_T7     2786                        2893
HUMPHOSLIP_PEA_2_T14    2779                        2886
HUMPHOSLIP_PEA_2_T16    2687                        2794
HUMPHOSLIP_PEA_2_T17    2409                        2516
HUMPHOSLIP_PEA_2_T18    2623                        2730
HUMPHOSLIP_PEA_2_T19    2777                        2884



Segment cluster HUMPHOSLIP_PEA_2_node_74 according to the present invention is
supported by 10 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMPHOSLIP_PEA_2_T6,
HUMPHOSLIP_PEA_2_T7, HUMPHOSLIP_PEA_2_T14, HUMPHOSLIP_PEA_2_T16,
HUMPHOSLIP_PEA_2_T17, HUMPHOSLIP_PEA_2_T18 and HUMPHOSLIP_PEA_2_T19.
Table 551 below describes the starting and ending position of this segment on each transcript.



Table 551 - Segment location on transcripts

Transcript name         Segment starting position   Segment ending position
HUMPHOSLIP_PEA_2_T6     2756                        2845
HUMPHOSLIP_PEA_2_T7     2894                        2983
HUMPHOSLIP_PEA_2_T14    2887                        2976
HUMPHOSLIP_PEA_2_T16    2795                        2884
HUMPHOSLIP_PEA_2_T17    2517                        2606
HUMPHOSLIP_PEA_2_T18    2731                        2820
HUMPHOSLIP_PEA_2_T19    2885                        2974



Variant protein alignment to the previously known protein:

Sequence name: PLTP_HUMAN



Sequence documentation:

Alignment of: HUMPHOSLIP_PEA_2_P10 x PLTP_HUMAN

Alignment segment 1/1:

Quality: 3716.00
Escore: 0
Matching length: 398    Total length: 493
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 80.73    Total Percent Identity: 80.73
Gaps: 1

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISE................................. 67
    |||||||||||||||||
 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100

 67 .................................................. 67

101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150

 68 ............KVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 105
                ||||||||||||||||||||||||||||||||||||||
151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 200

106 DTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRGAFFPLTERNWSLPN 155
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 DTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRGAFFPLTERNWSLPN 250

156 RAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLD 205
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLD 300

206 MLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASV 255
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 MLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASV 350

256 TIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHS 305
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 TIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHS 400

306 ALESLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVT 355
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ALESLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVT 450

356 NHAGFLTIGADLHFAKGLREVIEKNRPADVRASTAPTPSTAAV 398
    |||||||||||||||||||||||||||||||||||||||||||
451 NHAGFLTIGADLHFAKGLREVIEKNRPADVRASTAPTPSTAAV 493
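
The "Total" percentages reported for this alignment can be reproduced from the "Matching" figures with simple arithmetic. The Python sketch below assumes (this is an interpretation, not something stated in the text) that the total percentage is the number of matching residues divided by the total length. The function name `total_percent` is hypothetical.

```python
def total_percent(matching_length, matching_percent, total_length):
    """Scale a percentage measured over the matched region to the full length.

    Assumes the total percentage counts residues outside the matched
    region (here, the single gap) as non-matching.
    """
    matched_residues = matching_length * matching_percent / 100.0
    return round(100.0 * matched_residues / total_length, 2)


# Figures taken from the HUMPHOSLIP_PEA_2_P10 x PLTP_HUMAN header above.
p10_total_identity = total_percent(398, 100.00, 493)
# Residues of the known protein left unaligned by the single gap.
unaligned = 493 - 398
print(p10_total_identity, unaligned)
```

Under this assumption the sketch returns 80.73, matching the reported total percent identity, and the 95 unaligned residues correspond to the one gap in the alignment.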
Sequence name: PLTP_HUMAN
Sequence documentation : 
Alignment of: HUMPHOSLIP_PEA_2_P12 x PLTP_HUMAN
Alignment segment 1/1:

Quality: 4101.00
Escore: 0
Matching length: 427    Total length: 427
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100

101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150

151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 200

201 DTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRGAFFPLTERNWSLPN 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 DTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRGAFFPLTERNWSLPN 250

251 RAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLD 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLD 300

301 MLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASV 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 MLLRATYFGSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASV 350

351 TIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHS 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 TIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHS 400

401 ALESLALIPLQAPLKTMLQIGVMPMLN 427
    |||||||||||||||||||||||||||
401 ALESLALIPLQAPLKTMLQIGVMPMLN 427



Sequence name: PLTP_HUMAN
Sequence documentation:



Alignment of: HUMPHOSLIP_PEA_2_P31 x PLTP_HUMAN
Alignment segment 1/1:

Quality: 639.00
Escore: 0
Matching length: 67    Total length: 67
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISE 67
    |||||||||||||||||
 51 IPDLRGKEGHFYYNISE 67

Sequence name: PLTP_HUMAN
Sequence documentation:

Alignment of: HUMPHOSLIP_PEA_2_P33 x PLTP_HUMAN
Alignment segment 1/1:

Quality: 1767.00
Escore: 0
Matching length: 184    Total length: 184
Matching Percent Similarity: 100.00    Matching Percent Identity: 99.46
Total Percent Similarity: 100.00    Total Percent Identity: 99.46
Gaps: 0

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100

101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150

151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQV 184
    |||||||||||||||||||||||||||||||||:
151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQI 184



Sequence name: PLTP_HUMAN
Sequence documentation:



Alignment of: HUMPHOSLIP_PEA_2_P34 x PLTP_HUMAN
Alignment segment 1/1:

Quality: 1971.00
Escore: 0
Matching length: 205    Total length: 205
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100

101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150

151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLL 200

201 DTVPV 205
    |||||
201 DTVPV 205




Sequence name: PLTP_HUMAN 
Sequence documentation : 
Alignment of: HUMPHOSLIP_PEA_2_P35 x PLTP_HUMAN
Alignment segment 1/1:

Quality: 1158.00
Escore: 0
Matching length: 132    Total length: 184
Matching Percent Similarity: 100.00    Matching Percent Identity: 98.48
Total Percent Similarity: 71.74    Total Percent Identity: 70.65
Gaps: 1

Alignment:

  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETIT 50

 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 IPDLRGKEGHFYYNISEVKVTELQLTSSELDFQPQQELMLQITNASLGLR 100

101 FRRQLLYWFL........................................ 110
    |||||||||:
101 FRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASV 150

111 ............KVYDFLSTFITSGMRFLLNQQV 132
                |||||||||||||||||||||:
151 SRMHAAFGGTFKKVYDFLSTFITSGMRFLLNQQI 184
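By way of non-limiting illustration only, the percentage figures reported in the alignment headers above can be reproduced with the following minimal sketch. It assumes (this interpretation is inferred from the reported numbers and is not stated in the report itself) that "Matching" percentages are taken over the aligned, ungapped positions, that "Total" percentages are taken over the full length of the known protein, and that "." marks a gap position. The function name and the toy rows are hypothetical.

```python
def alignment_stats(top: str, bottom: str) -> dict:
    """Identity statistics for two gapped alignment rows of equal length.

    Assumption: '.' marks a gap position in either row.
    """
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    total = len(top)
    # Positions where both rows carry a residue (no gap on either side).
    matched = sum(1 for a, b in zip(top, bottom) if a != "." and b != ".")
    # Matched positions where the residues are identical.
    identical = sum(1 for a, b in zip(top, bottom) if a == b and a != ".")
    return {
        "matching_length": matched,
        "total_length": total,
        "matching_percent_identity": round(100.0 * identical / matched, 2),
        "total_percent_identity": round(100.0 * identical / total, 2),
    }


# Toy rows (hypothetical, not taken from the application).
toy = alignment_stats("ABCD..EF", "ABCDXXEG")

# Counts read off the P35 header: 132 matched positions out of 184,
# of which 130 are identical (two mismatches, both marked ':').
p35_matching_identity = round(100.0 * 130 / 132, 2)
p35_total_identity = round(100.0 * 130 / 184, 2)
p35_total_similarity = round(100.0 * 132 / 184, 2)
```

Under these assumptions the P35 header values (98.48, 70.65 and 71.74) are mutually consistent, as is the 99.46 reported for P33 (183 identical positions out of 184).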



DESCRIPTION FOR CLUSTER AI076020
Cluster AI076020 features 1 transcript(s) and 8 segment(s) of interest, the names for
which are given in Tables 552 and 553, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 554.



Table 552 - Transcripts of interest

Transcript Name    Sequence ID No.
AI076020_T0        58

Table 553 - Segments of interest

Segment Name       Sequence ID No.
AI076020_node_0    571
AI076020_node_3    572
AI076020_node_8    573
AI076020_node_1    574
AI076020_node_4    575
AI076020_node_5    576
AI076020_node_6    577
AI076020_node_7    578

Table 554 - Proteins of interest

Protein Name    Sequence ID No.    Corresponding Transcript(s)
AI076020_P1     1334               AI076020_T0



These sequences are variants of the known protein C1q-related factor precursor
(SwissProt accession identifier C1RF_HUMAN), SEQ ID NO: 1434, referred to herein as the
previously known protein.

The sequence for protein C1q-related factor precursor is given at the end of the
application, as "C1q-related factor precursor amino acid sequence".

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: locomotory behavior, which are annotation(s) related to Biological
Process.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or LocusLink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster AI076020 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 31 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
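By way of illustration only, the "parts per million" figure described above can be sketched as a simple ratio. The sketch assumes an unweighted count ratio; the weighting scheme actually applied to the ESTs is described elsewhere in the application and is not reproduced here. The function name and the example counts are hypothetical.

```python
def ests_ppm(cluster_ests: int, category_ests: int) -> float:
    """ESTs of one cluster per million ESTs in a tissue category.

    Assumption: a plain ratio of counts, with no library weighting.
    """
    if category_ests <= 0:
        raise ValueError("category must contain at least one EST")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical counts: 3 cluster ESTs among 100,000 ESTs in a category.
example_ppm = ests_ppm(3, 100_000)
```

With these hypothetical counts the cluster would be reported at 30 parts per million for that category.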

Overall, the following results were obtained as shown with regard to the histograms in
Figure 31 and Table 555. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: brain malignant tumors and a mixture of malignant tumors
from different tissues.

Table 555 - Normal tissue distribution

Name of Tissue    Number
bone              0
brain             9
epithelial        0
general           4
kidney            2
lung              0
ovary             0
pancreas          30
uterus            0

Table 556 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3      SP2        R4
bone              3.3e-01    5.9e-02    4.0e-01    2.5     2.4e-01    3.0
brain             8.8e-04    2.2e-03    5.5e-11    14.2    4.6e-08    8.7
epithelial        2.6e-01    8.6e-02    2.8e-01    2.4     1.8e-02    4.5
general           2.1e-03    3.0e-04    2.0e-06    4.3     8.4e-06    3.5
kidney            5.5e-01    3.3e-01    3.4e-01    2.3     8.2e-02    3.3
lung              1          6.3e-01    1          1.0     3.8e-01    2.2
ovary             4.2e-01    4.5e-01    0.0e+00    0.0     0.0e+00    0.0
pancreas          6.0e-01    7.1e-01    8.9e-01    0.6     9.5e-01    0.5
uterus            1          4.0e-01    1          1.0     6.4e-01    1.5

As noted above, cluster AI076020 features 1 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein C1q-related
factor precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein AI076020_P1 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) AI076020_T0. The
location of the variant protein was determined according to results from a number of different
software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
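The reasoning stated in the preceding paragraph can be expressed, purely as an illustrative sketch, as a simple decision rule over the outputs of the prediction programs. The function name, the boolean inputs and the "undetermined" fallback are hypothetical; only the "secreted" branch reflects the rule given in the text, and other variants in the application may be assigned under different criteria.

```python
def predict_localization(signal_peptide_calls, transmembrane_calls):
    """Return 'secreted' when every signal-peptide predictor reports a
    signal peptide and no trans-membrane predictor reports a
    trans-membrane region; otherwise return 'undetermined'.

    Each argument is a list of booleans, one per prediction program.
    """
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"


# Both signal-peptide programs agree; neither trans-membrane program fires.
call = predict_localization([True, True], [False, False])
```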
Variant protein AI076020_P1 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 557 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein AI076020_P1 sequence provides
support for the deduced sequence of this variant protein according to the present invention).



Table 557 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
36                                        P->R                         Yes
66                                        Q->R                         Yes
165                                       K->R                         Yes



Variant protein AI076020_P1 is encoded by the following transcript(s): AI076020_T0,
for which the sequence(s) is/are given at the end of the application. The coding portion of
transcript AI076020_T0 is shown in bold; this coding portion starts at position 261 and ends at
position 1034. The transcript also has the following SNPs as listed in Table 558 (given according
to their position on the nucleotide sequence, with the alternative nucleic acid listed; the last
column indicates whether the SNP is known or not; the presence of known SNPs in variant
protein AI076020_P1 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 558 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
367                                    C->G                        Yes
457                                    A->G                        Yes
464                                    C->A                        Yes
754                                    A->G                        Yes
1265                                   C->T                        Yes
1384                                   C->T                        Yes
1402                                   G->C                        Yes
1452                                   T->C                        Yes
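As a non-limiting illustration, the nucleotide SNP positions of transcript AI076020_T0 can be related to amino acid positions using the stated coding portion (positions 261 to 1034). The sketch below assumes 1-based, inclusive coordinates and that the first coding nucleotide begins codon 1; the function name is hypothetical.

```python
def codon_of(nt_pos: int, cds_start: int, cds_end: int):
    """Map a 1-based transcript position to (codon number, base in codon).

    Returns None when the position lies outside the coding portion.
    Assumes the coding portion starts in frame at cds_start.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START, CDS_END = 261, 1034  # coding portion of AI076020_T0 as stated
snp_positions = [367, 457, 464, 754, 1265, 1384, 1402, 1452]
mapped = {pos: codon_of(pos, CDS_START, CDS_END) for pos in snp_positions}
```

Under these assumptions, positions 367, 457 and 754 fall in the second base of codons 36, 66 and 165 respectively, which agrees with the amino acid positions listed for the non-silent SNPs of variant protein AI076020_P1; position 464 falls in the third base of codon 68, and positions 1265 and above lie outside the coding portion.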



As noted above, cluster AI076020 features 8 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster AI076020_node_0 according to the present invention is supported by 28
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 559 below describes the starting and
ending position of this segment on each transcript.



Table 559 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1                            774



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 560.

Table 560 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
AI076020_0_3_0          lung malignant tumors       LUN



Segment cluster AI076020_node_3 according to the present invention is supported by 30
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 561 below describes the starting and
ending position of this segment on each transcript.



WO 2006/131783 



PCT/IB2005/004037 




Table 561 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        858                          1027



Segment cluster AI076020_node_8 according to the present invention is supported by 35
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 562 below describes the starting and
ending position of this segment on each transcript.



Table 562 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1359                         1533



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster AI076020_node_1 according to the present invention is supported by 19
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 563 below describes the starting and
ending position of this segment on each transcript.



Table 563 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        775                          857

Segment cluster AI076020_node_4 according to the present invention is supported by 28
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 564 below describes the starting and
ending position of this segment on each transcript.



Table 564 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1028                         1129



Segment cluster AI076020_node_5 according to the present invention is supported by 31
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 565 below describes the starting and
ending position of this segment on each transcript.



Table 565 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1130                         1244



Segment cluster AI076020_node_6 according to the present invention is supported by 32
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 566 below describes the starting and
ending position of this segment on each transcript.

Table 566 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1245                         1320

Segment cluster AI076020_node_7 according to the present invention is supported by 33
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): AI076020_T0. Table 567 below describes the starting and
ending position of this segment on each transcript.



Table 567 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
AI076020_T0        1321                         1358
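For illustration only, the segment coordinates given in Tables 559 and 561 through 567 can be checked for consistency with the following sketch, which assumes 1-based, inclusive positions on transcript AI076020_T0. The helper names are hypothetical; the 120 bp threshold is the approximate figure stated above for short segments.

```python
# Segment coordinates on AI076020_T0, as listed in Tables 559 and 561-567.
segments = {
    "AI076020_node_0": (1, 774),
    "AI076020_node_1": (775, 857),
    "AI076020_node_3": (858, 1027),
    "AI076020_node_4": (1028, 1129),
    "AI076020_node_5": (1130, 1244),
    "AI076020_node_6": (1245, 1320),
    "AI076020_node_7": (1321, 1358),
    "AI076020_node_8": (1359, 1533),
}


def segment_length(span):
    """Length of a 1-based, inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def is_contiguous(spans):
    """True when the sorted spans tile a region with no gaps or overlaps."""
    ordered = sorted(spans)
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


lengths = {name: segment_length(span) for name, span in segments.items()}
short_segments = sorted(name for name, n in lengths.items() if n <= 120)
tiles_transcript = is_contiguous(segments.values())
```

Under these assumptions the eight segments tile positions 1 to 1533 without gaps or overlaps, and the segments of at most 120 bp are nodes 1, 4, 5, 6 and 7, which are the ones described separately as short segments.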



DESCRIPTION FOR CLUSTER T23580 
Cluster T23580 features 1 transcript(s) and 5 segment(s) of interest, the names for which
are given in Tables 568 and 569, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 570.

Table 568 - Transcripts of interest

Transcript Name    Sequence ID No.
T23580_T10         1626

Table 569 - Segments of interest

Segment Name      Sequence ID No.
T23580_node_17    579
T23580_node_18    580
T23580_node_21    581
T23580_node_19    582
T23580_node_20    583

Table 570 - Proteins of interest

Protein Name    Sequence ID No.    Corresponding Transcript(s)
T23580_P5       1335               T23580_T10



These sequences are variants of the known protein Neuronal protein NP25 (SwissProt
accession identifier TAG3_HUMAN; known also according to the synonyms Neuronal protein
22; NP22; Transgelin-3), SEQ ID NO: 1435, referred to herein as the previously known protein
and also as NP25_HUMAN, which is the former SwissProt accession identifier.

The sequence for protein Neuronal protein NP25 is given at the end of the application, as 
"Neuronal protein NP25 amino acid sequence". 

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: central nervous system development, which are annotation(s) related
to Biological Process.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or LocusLink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

For this cluster, at least one oligonucleotide was found to demonstrate overexpression of
the cluster, although not of at least one transcript/segment as listed below. Microarray (chip)
data is also available for this cluster as follows. Various oligonucleotides were tested for being
differentially expressed in various disease conditions, particularly cancer, as previously
described. The following oligonucleotides were found to hit this cluster but not other
segments/transcripts below, shown in Table 571, with regard to lung cancer.

Table 571 - Oligonucleotides related to this cluster

Oligonucleotide name    Overexpressed in cancers    Chip reference
T23580_0_0_902          lung malignant tumors       LUN



As noted above, cluster T23580 features 1 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein Neuronal protein
NP25. A description of each variant protein according to the present invention is now provided.

Variant protein T23580_P5 according to the present invention has an amino acid sequence
as given at the end of the application; it is encoded by transcript(s) T23580_T10. The location of
the variant protein was determined according to results from a number of different software
programs and analyses, including analyses from SignalP and other specialized programs. The
variant protein is believed to be located as follows with regard to the cell: secreted. The protein
localization is believed to be secreted because one of the two signal-peptide prediction programs
(HMM: Signal peptide, NN: NO) predicts that this protein has a signal peptide.

Variant protein T23580_P5 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 572 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein T23580_P5 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 572 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
129                                       V->I                         Yes



Variant protein T23580_P5 is encoded by the following transcript(s): T23580_T10, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
T23580_T10 is shown in bold; this coding portion starts at position 1066 and ends at position
1485. The transcript also has the following SNPs as listed in Table 573 (given according to their
position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
T23580_P5 sequence provides support for the deduced sequence of this variant protein
according to the present invention).

Table 573 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
37                                     A->C                        Yes
320                                    G->A                        Yes
371                                    G->T                        Yes
372                                    G->A                        Yes
441                                    A->G                        Yes
699                                    G->C                        Yes
744                                    C->G                        Yes
862                                    G->T                        Yes
1450                                   G->A                        Yes

As noted above, cluster T23580 features 5 segment(s), which were listed in Table 2 above 
and for which the sequence(s) are given at the end of the application. These segment(s) are 
portions of nucleic acid sequence(s) which are described herein separately because they are of 
particular interest. A description of each segment according to the present invention is now 
provided. 






Segment cluster T23580_node_17 according to the present invention is supported by 10 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): T23580_T10. Table 574 below describes the starting and 
ending position of this segment on each transcript. 

Table 574 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T23580_T10                                      1098



Segment cluster T23580_node_18 according to the present invention is supported by 102
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): T23580_T10. Table 575 below describes the starting and
ending position of this segment on each transcript.

Table 575 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T23580_T10         1099                         1357



Segment cluster T23580_node_21 according to the present invention is supported by 79 
libraries. The number of libraries was determined as previously described. This segment can be 
found in the following transcript(s): T23580_T10. Table 576 below describes the starting and 
ending position of this segment on each transcript. 

Table 576 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T23580_T10         1382                         1582



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster T23580_node_19 according to the present invention can be found in the 
following transcript(s): T23580_T10. Table 577 below describes the starting and ending 
position of this segment on each transcript. 

Table 577 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T23580_T10         1358                         1370



Segment cluster T23580_node_20 according to the present invention can be found in the
following transcript(s): T23580_T10. Table 578 below describes the starting and ending
position of this segment on each transcript.

Table 578 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
T23580_T10         1371                         1381



DESCRIPTION FOR CLUSTER M79217 
Cluster M79217 features 6 transcript(s) and 32 segment(s) of interest, the names for
which are given in Tables 579 and 580, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 581.



Table 579 - Transcripts of interest

Transcript Name     Sequence ID No.
M79217_PEA_1_T1     59
M79217_PEA_1_T3     60
M79217_PEA_1_T8     61
M79217_PEA_1_T10    62
M79217_PEA_1_T15    63
M79217_PEA_1_T18    64



Table 580 - Segments of interest

Segment Name            Sequence ID No.
M79217_PEA_1_node_2     584
M79217_PEA_1_node_4     585
M79217_PEA_1_node_9     586
M79217_PEA_1_node_10    587
M79217_PEA_1_node_11    588
M79217_PEA_1_node_13    589
M79217_PEA_1_node_14    590
M79217_PEA_1_node_16    591
M79217_PEA_1_node_23    592
M79217_PEA_1_node_24    593
M79217_PEA_1_node_31    594
M79217_PEA_1_node_33    595
M79217_PEA_1_node_34    596
M79217_PEA_1_node_35    597
M79217_PEA_1_node_37    598
M79217_PEA_1_node_38    599
M79217_PEA_1_node_41    600
M79217_PEA_1_node_44    601
M79217_PEA_1_node_0     602
M79217_PEA_1_node_7     603
M79217_PEA_1_node_12    604
M79217_PEA_1_node_19    605
M79217_PEA_1_node_21    606
M79217_PEA_1_node_26    607
M79217_PEA_1_node_27    608
M79217_PEA_1_node_30    609
M79217_PEA_1_node_32    610
M79217_PEA_1_node_36    611
M79217_PEA_1_node_39    612
M79217_PEA_1_node_40    613
M79217_PEA_1_node_42    614
M79217_PEA_1_node_43    615



Table 581 - Proteins of interest

Protein Name        Sequence ID No.    Corresponding Transcript(s)
M79217_PEA_1_P1     1336               M79217_PEA_1_T1; M79217_PEA_1_T3
M79217_PEA_1_P2     1337               M79217_PEA_1_T8
M79217_PEA_1_P4     1338               M79217_PEA_1_T10
M79217_PEA_1_P8     1339               M79217_PEA_1_T15
M79217_PEA_1_P11    1340               M79217_PEA_1_T18



These sequences are variants of the known protein Exostosin-like 3 (SwissProt accession
identifier EXL3_HUMAN; known also according to the synonyms EC 2.4.1.223;
Glucuronyl-galactosyl-proteoglycan 4-alpha-N-acetylglucosaminyltransferase; Putative tumor
suppressor protein EXTL3; Multiple exostosis-like protein 3; Hereditary multiple exostoses gene
isolog; EXT-related protein 1), SEQ ID NO: 1436, referred to herein as the previously known protein.

Protein Exostosin-like 3 is known or believed to have the following function(s): Probable
glycosyltransferase (By similarity). The sequence for protein Exostosin-like 3 is given at the end
of the application, as "Exostosin-like 3 amino acid sequence". Protein Exostosin-like 3
localization is believed to be Type II membrane protein. Endoplasmic reticulum.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: cell growth and/or maintenance, which are annotation(s) related to
Biological Process; transferase, transferring glycosyl groups, which are annotation(s) related to
Molecular Function; and endoplasmic reticulum; integral membrane protein, which are
annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or LocusLink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster M79217 features 6 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein Exostosin-like 3.
A description of each variant protein according to the present invention is now provided.

Variant protein M79217_PEA_1_P1 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M79217_PEA_1_T1. An alignment is given to the known protein (Exostosin-like 3) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:
Comparison report between M79217_PEA_1_P1 and BAA25445 (SEQ ID NO: 1437):
1. An isolated chimeric polypeptide encoding for M79217_PEA_1_P1, comprising a first
amino acid sequence being at least 90 % homologous to

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEAD 

EAGKRIFGPRVGNELCEVKHVLDLCPaRESVSEELLQLEAKRQELNSEIAKLNLKIEACK 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGC 

RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA 

CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 

RAMVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 

RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 

ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPWLGEQVQLPY 

QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 

LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 

RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 

QAALGGNWREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 

WPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 

RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRD 

MVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFHERHKCINFF 

VKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to amino acids 13 - 

931 of BAA25445, which also corresponds to amino acids 1 - 919 of M79217_PEA_1_P1. 

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because the Signalp_hmm 
software predicts that this protein has a signal anchor region. 

Variant protein M79217_PEA_1_P1 is encoded by the following transcript(s):
M79217_PEA_1_T1, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M79217_PEA_1_T1 is shown in bold; this coding portion starts at
position 1074 and ends at position 3830. The transcript also has the following SNPs as listed in
Table 582 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M79217_PEA_1_P1 sequence provides support for the deduced
sequence of this variant protein according to the present invention).
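As an illustrative consistency check only, the length of the stated coding portion of transcript M79217_PEA_1_T1 (positions 1074 to 3830) can be compared with the length of the region of BAA25445 (amino acids 13 to 931) said to correspond to amino acids 1 to 919 of the variant protein. The sketch assumes 1-based, inclusive coordinates; the function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive span."""
    return end - start + 1


def codons_in(cds_start: int, cds_end: int) -> int:
    """Number of whole codons in a coding portion given by transcript
    positions; raises if the span is not a multiple of three."""
    nucleotides = span_length(cds_start, cds_end)
    if nucleotides % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return nucleotides // 3


codons = codons_in(1074, 3830)           # coding portion of M79217_PEA_1_T1
aligned_residues = span_length(13, 931)  # aligned region of BAA25445
variant_residues = span_length(1, 919)   # variant protein M79217_PEA_1_P1
```

Under these assumptions the three counts agree, each being 919.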



Table 582 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
1014                                   C->T                        No
1015                                   T->                         No
1072                                   T->C                        No
1232                                   T->A                        No
1383                                   A->G                        No
1440                                   A->G                        No
1544                                   C->                         No
1546                                   G->A                        No
1685                                   T->G                        No
2215                                   C->                         No
2300                                   A->G                        Yes
2483                                   T->C                        No
2518                                   C->                         No
2632                                   T->G                        No
3190                                   T->C                        Yes
3352                                   T->C                        No
3373                                   G->T                        No
3386                                   C->                         No
3449                                   C->T                        Yes
3618                                   A->G                        No
3733                                   A->G                        No
4021                                   C->                         No
4021                                   C->T                        No
4086                                   G->A                        No
4087                                   G->A                        No
4416                                   T->A                        No
4586                                   G->A                        Yes
4772                                   C->T                        No
5110                                   C->T                        Yes
5219                                   C->T                        Yes
5437                                   G->A                        No
5645                                   G->A                        No
5743                                   G->A                        Yes
5887                                   G->T                        Yes
6143                                   A->C                        No
6277                                   G->                         No
6277                                   G->C                        No
6295                                   C->G                        Yes
6308                                   T->A                        No
6403                                   G->A                        Yes
6442                                   G->                         No
6495                                   C->T                        No



Variant protein M79217_PEA_1_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M79217_PEA_1_T8. An alignment is given to the known protein (Exostosin-like 3) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:
Comparison report between M79217_PEA_1_P2 and EXL3_HUMAN:

1. An isolated chimeric polypeptide encoding for M79217_PEA_1_P2, comprising a first
amino acid sequence being at least 90 % homologous to

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEAD 

EAGKRIFGPRVGNELCEVKHVLDLCmRESVSEELLQLEAK^QELNSEIAKLNLKIEACK 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGC 

RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA 

CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 

RAMVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 

RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 

ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPWLGEQVQLPY 

QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 

LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 

RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 

QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 

WPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 

RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK corresponding to amino 

acids 1 - 807 of EXL3_HUMAN, which also corresponds to amino acids 1 - 807 of

M79217_PEA_1_P2, and a second amino acid sequence being at least 90 % homologous to

AIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFHERHK 

CINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to amino acids 

820 - 919 of EXL3_HUMAN, which also corresponds to amino acids 808 - 907 of

M79217_PEA_1_P2, wherein said first amino acid sequence and second amino acid sequence

are contiguous and in a sequential order. 

2. An isolated chimeric polypeptide encoding for an edge portion of M79217_PEA_1_P2,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in 
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino 
acids in length, more preferably at least about 40 amino acids in length and most preferably at 
least about 50 amino acids in length, wherein at least two amino acids comprise KA, having a 
structure as follows: a sequence starting from any of amino acid numbers 807-x to 807; and 
ending at any of amino acid numbers 808+ ((n-2) - x), in which x varies from 0 to n-2. 
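
The edge-portion definition above can be made concrete with a short sketch. It assumes only what the text states: the junction lies between amino acids 807 and 808 of M79217_PEA_1_P2, a window starts at 807 - x and ends at 808 + ((n - 2) - x), and x runs from 0 to n - 2. The function name `edge_portion_windows` and its parameters are hypothetical, chosen for illustration.

```python
def edge_portion_windows(n, junction_left=807):
    """Enumerate (start, end) residue ranges of length n that span the junction.

    Follows the formula in the text: start = junction_left - x and
    end = (junction_left + 1) + ((n - 2) - x), for x = 0 .. n - 2.
    Positions are 1-based and inclusive.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to contain both junction residues")
    junction_right = junction_left + 1
    windows = []
    for x in range(0, n - 1):  # x varies from 0 to n - 2 inclusive
        start = junction_left - x
        end = junction_right + ((n - 2) - x)
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    # Illustrative call with n = 10, the smallest length named in the text.
    for start, end in edge_portion_windows(10):
        print(start, end, end - start + 1)
```

Every window produced this way has length n and contains both residue 807 and residue 808, which is the property the edge-portion wording relies on.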

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because the Signalp_hmm 
software predicts that this protein has a signal anchor region. 

Variant protein M79217_PEA_1_P2 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 583 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M79217_PEA_1_P2
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 
Table 583 - Amino acid mutations

SNP position(s) on amino acid sequence  Alternative amino acid(s)   Previously known SNP?
104                                     N -> D                      No
123                                     N -> D                      No
157                                     I ->                        No
158                                     R -> Q                      No
204                                     F -> L                      No
381                                     A ->                        No
482                                     A ->                        No
520                                     F -> C                      No
706                                     L -> P                      Yes
760                                     V -> A                      No
767                                     R -> L                      No
771                                     F ->                        No
837                                     I -> V                      No
875                                     Y -> C                      No

The glycosylation sites of variant protein M79217_PEA_1_P2, as compared to the known
protein Exostosin-like 3, are described in Table 584 (given according to their position(s) on the
amino acid sequence in the first column; the second column indicates whether the glycosylation
site is present in the variant protein; and the last column indicates whether the position is
different on the variant protein).
Table 584 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
290                                         yes                            290
592                                         yes                            592
790                                         yes                            790
277                                         yes                            277



Variant protein M79217_PEA_1_P2 is encoded by the following transcript(s):
M79217_PEA_1_T8, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M79217_PEA_1_T8 is shown in bold; this coding portion starts at
position 748 and ends at position 3468. The transcript also has the following SNPs as listed in
Table 585 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M79217_PEA_1_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 585 - Nucleic acid SNPs

SNP position on nucleotide sequence     Alternative nucleic acid    Previously known SNP?
688                                     C -> T                      No
689                                     T ->                        No
746                                     T -> C                      No
906                                     T -> A                      No
1057                                    A -> G                      No
1114                                    A -> G                      No
1218                                    C ->                        No
1220                                    G -> A                      No
1359                                    T -> G                      No
1889                                    C ->                        No
1974                                    A -> G                      Yes
2157                                    T -> C                      No
2192                                    C ->                        No
2306                                    T -> G                      No
2864                                    T -> C                      Yes
3026                                    T -> C                      No
3047                                    G -> T                      No
3060                                    C ->                        No
3123                                    C -> T                      Yes
3256                                    A -> G                      No
3371                                    A -> G                      No
3659                                    C ->                        No
3659                                    C -> T                      No
3724                                    G -> A                      No
3725                                    G -> A                      No
4054                                    T -> A                      No
4224                                    G -> A                      Yes
4410                                    C -> T                      No
4748                                    C -> T                      Yes
4857                                    C -> T                      Yes
5075                                    G -> A                      No
5283                                    G -> A                      No
5381                                    G -> A                      Yes
5525                                    G -> T                      Yes
5781                                    A -> C                      No
5915                                    G ->                        No
5915                                    G -> C                      No
5933                                    C -> G                      Yes
5946                                    T -> A                      No
6041                                    G -> A                      Yes
6080                                    G ->                        No
6133                                    C -> T                      No
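
The nucleotide positions in Table 585 and the amino acid positions in Table 583 can be related through the stated coding portion of M79217_PEA_1_T8 (positions 748 to 3468). The sketch below assumes that the reading frame begins at the first nucleotide of the coding portion and that positions are 1-based; the function name `codon_index` is hypothetical. The position pairs are read from the two tables and are used only as a consistency check.

```python
def codon_index(nt_pos, cds_start, cds_end):
    """Return the 1-based amino acid position for a 1-based transcript position.

    Returns None when the position lies outside the coding portion.
    Assumes the reading frame starts at cds_start.
    """
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Coding portion of M79217_PEA_1_T8 as stated in the text.
CDS_START, CDS_END = 748, 3468

# (nucleotide position in Table 585, amino acid position in Table 583)
PAIRS = [(1057, 104), (1114, 123), (1220, 158), (2864, 706), (3371, 875)]

if __name__ == "__main__":
    for nt_pos, aa_pos in PAIRS:
        print(nt_pos, codon_index(nt_pos, CDS_START, CDS_END), aa_pos)
```

Positions such as 688 or 3659 fall outside the coding portion under this mapping, so they cannot change the amino acid sequence.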



Variant protein M79217_PEA_1_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M79217_PEA_1_T10. An alignment is given to the known protein (Exostosin-like 3) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:
Comparison report between M79217_PEA_1_P4 and EXL3_HUMAN:

1. An isolated chimeric polypeptide encoding for M79217_PEA_1_P4, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence

PELRQPARLGLPECWDYRHEPRCPAQMGSHFIVQAGLKLLASSKPPKCWDY
corresponding to amino acids 1-51 of M79217_PEA_1_P4, and a second amino acid sequence
being at least 90 % homologous to
RVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSY
VMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFH
ERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI corresponding to
amino acids 759 - 919 of EXL3_HUMAN, which also corresponds to amino acids 52 - 212 of
M79217_PEA_1_P4, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of M79217_PEA_1_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence PELRQPARLGLPECWDYRHEPRCPAQMGSHFIVQAGLKLLASSKPPKCWDY
of M79217_PEA_1_P4.

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because, although it is a partial
protein, both trans-membrane region prediction programs predict that this protein has a
trans-membrane region.

Variant protein M79217_PEA_1_P4 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 586 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M79217_PEA_1_P4
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 
Table 586 - Amino acid mutations

SNP position(s) on amino acid sequence  Alternative amino acid(s)   Previously known SNP?
53                                      V -> A                      No
60                                      R -> L                      No
64                                      F ->                        No
142                                     I -> V                      No
180                                     Y -> C                      No



The glycosylation sites of variant protein M79217_PEA_1_P4, as compared to the known 
protein Exostosin-like 3, are described in Table 587 (given according to their position(s) on the 
amino acid sequence in the first column; the second column indicates whether the glycosylation 
site is present in the variant protein; and the last column indicates whether the position is 
different on the variant protein). 






Table 587 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
290                                         no
592                                         no
790                                         yes                            83
277                                         no
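
The glycosylation tables can be cross-checked against the alignment blocks given in the comparison reports. The sketch below is a minimal illustration, not the method used in the application: it assumes that a site on the known protein is retained whenever it falls inside an aligned block, and it shifts the position by the block offset. The block coordinates are taken from the text (amino acids 1 - 807 and 820 - 919 of EXL3_HUMAN for M79217_PEA_1_P2; amino acids 759 - 919 for M79217_PEA_1_P4); the names `P2_BLOCKS`, `P4_BLOCKS` and `map_position` are hypothetical.

```python
# Alignment blocks as (known_start, known_end, variant_start), 1-based inclusive,
# taken from the comparison reports in the text.
P2_BLOCKS = [(1, 807, 1), (820, 919, 808)]
P4_BLOCKS = [(759, 919, 52)]

# Glycosylation site positions on the known protein, as listed in the tables.
SITES = [290, 592, 790, 277]


def map_position(known_pos, blocks):
    """Map a position on the known protein to the variant protein.

    Returns None when the position is not covered by any aligned block,
    i.e. the site is treated as absent from the variant.
    """
    for known_start, known_end, variant_start in blocks:
        if known_start <= known_pos <= known_end:
            return variant_start + (known_pos - known_start)
    return None


if __name__ == "__main__":
    for site in SITES:
        print(site, map_position(site, P2_BLOCKS), map_position(site, P4_BLOCKS))
```

Under these assumptions all four sites keep their positions in M79217_PEA_1_P2, while in M79217_PEA_1_P4 only site 790 is retained, at position 83, which agrees with Tables 584 and 587.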





Variant protein M79217_PEA_1_P4 is encoded by the following transcript(s):
M79217_PEA_1_T10, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M79217_PEA_1_T10 is shown in bold; this coding portion starts at
position 1 and ends at position 637. The transcript also has the following SNPs as listed in Table
588 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein M79217_PEA_1_P4 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 588 - Nucleic acid SNPs

SNP position on nucleotide sequence     Alternative nucleic acid    Previously known SNP?
159                                     T -> C                      No
180                                     G -> T                      No
193                                     C ->                        No
256                                     C -> T                      Yes
425                                     A -> G                      No
540                                     A -> G                      No
828                                     C ->                        No
828                                     C -> T                      No
893                                     G -> A                      No
894                                     G -> A                      No
1223                                    T -> A                      No
1393                                    G -> A                      Yes
1579                                    C -> T                      No
1917                                    C -> T                      Yes
2026                                    C -> T                      Yes
2244                                    G -> A                      No
2452                                    G -> A                      No
2550                                    G -> A                      Yes
2694                                    G -> T                      Yes
2950                                    A -> C                      No
3084                                    G ->                        No
3084                                    G -> C                      No
3102                                    C -> G                      Yes
3115                                    T -> A                      No
3210                                    G -> A                      Yes
3249                                    G ->                        No
3302                                    C -> T                      No



Variant protein M79217_PEA_1_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M79217_PEA_1_T15. An alignment is given to the known protein (Exostosin-like 3) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:
Comparison report between M79217_PEA_1_P8 and EXL3_HUMAN:

1. An isolated chimeric polypeptide encoding for M79217_PEA_1_P8, comprising a first
amino acid sequence being at least 90 % homologous to

MTGYTMLRNGGAGNGGQTCMLRWSNMRETWLSFTLFVILVFFPLIAHYYLTTLDEAD 
EAGKRIFGPRVGNELCEVKHVLDL^ 

KSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLSLPIRLLPEKDDAGLPP
RLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTENADIA
CLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTG 
RAMVAQSTFYTVQYRPGFDLWSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESL 
RSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEW 
ALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPVVLGEQVQLPY 
QDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTV 
LAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYL 
RNFTLTVTDFYRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 
QAALGGNWREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPKLPSEDLL 
WTDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARD 
RIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK corresponding to amino 
acids 1 - 807 of EXL3_HUMAN, which also corresponds to amino acids 1 - 807 of 
M79217_PEA_1_P8, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence VRKSW corresponding to amino acids 808 - 
812 of M79217_PEA_1_P8, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order. 

2.An isolated polypeptide encoding for a tail of M79217_PEA_1_P8, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence VRKSW in M79217_PEA_1_P8. 

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because the Signalp_hmm 
software predicts that this protein has a signal anchor region. 

Variant protein M79217_PEA_1_P8 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 589 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M79217_PEA_1_P8
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 589 - Amino acid mutations

SNP position(s) on amino acid sequence  Alternative amino acid(s)   Previously known SNP?
104                                     N -> D                      No
123                                     N -> D                      No
157                                     I ->                        No
158                                     R -> Q                      No
204                                     F -> L                      No
381                                     A ->                        No
482                                     A ->                        No
520                                     F -> C                      No
706                                     L -> P                      Yes
760                                     V -> A                      No
767                                     R -> L                      No
771                                     F ->                        No



The glycosylation sites of variant protein M79217_PEA_1_P8, as compared to the known
protein Exostosin-like 3, are described in Table 590 (given according to their position(s) on the
amino acid sequence in the first column; the second column indicates whether the glycosylation 
site is present in the variant protein; and the last column indicates whether the position is 
different on the variant protein). 

Table 590 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
290                                         yes                            290
592                                         yes                            592
790                                         yes                            790
277                                         yes                            277



Variant protein M79217_PEA_1_P8 is encoded by the following transcript(s):
M79217_PEA_1_T15, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M79217_PEA_1_T15 is shown in bold; this coding portion starts at
position 748 and ends at position 3183. The transcript also has the following SNPs as listed in
Table 591 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M79217_PEA_1_P8 sequence provides support for the deduced
sequence of this variant protein according to the present invention). 

Table 591 - Nucleic acid SNPs

SNP position on nucleotide sequence     Alternative nucleic acid    Previously known SNP?
688                                     C -> T                      No
689                                     T ->                        No
746                                     T -> C                      No
906                                     T -> A                      No
1057                                    A -> G                      No
1114                                    A -> G                      No
1218                                    C ->                        No
1220                                    G -> A                      No
1359                                    T -> G                      No
1889                                    C ->                        No
1974                                    A -> G                      Yes
2157                                    T -> C                      No
2192                                    C ->                        No
2306                                    T -> G                      No
2864                                    T -> C                      Yes
3026                                    T -> C                      No
3047                                    G -> T                      No
3060                                    C ->                        No
3123                                    C -> T                      Yes
3391                                    C -> T                      No
3560                                    T -> C                      No
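
The coding portions quoted for each transcript can be sanity-checked by converting their start and end positions into a codon count. The sketch below assumes 1-based, inclusive coordinates; the function name `coding_portion_size` is hypothetical. The coordinates are those stated in the text for each transcript.

```python
def coding_portion_size(start, end):
    """Return (nucleotides, whole codons, leftover nucleotides) for a coding
    portion given by 1-based, inclusive start and end positions."""
    nucleotides = end - start + 1
    codons, leftover = divmod(nucleotides, 3)
    return nucleotides, codons, leftover


# Coding portions as stated in the text for each transcript.
CODING_PORTIONS = {
    "M79217_PEA_1_T1": (1074, 3830),
    "M79217_PEA_1_T8": (748, 3468),
    "M79217_PEA_1_T10": (1, 637),
    "M79217_PEA_1_T15": (748, 3183),
    "M79217_PEA_1_T18": (1354, 1674),
}

if __name__ == "__main__":
    for name, (start, end) in CODING_PORTIONS.items():
        print(name, coding_portion_size(start, end))
```

For M79217_PEA_1_T1, M79217_PEA_1_T8 and M79217_PEA_1_T15 the codon counts (919, 907 and 812) agree with the protein lengths given in the comparison reports. The coding portion of M79217_PEA_1_T10 leaves one nucleotide over, which may be related to the text describing M79217_PEA_1_P4 as a partial protein.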



Variant protein M79217_PEA_1_P11 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M79217_PEA_1_T18. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because one of the
two signal-peptide prediction programs (HMM: Signal peptide, NN: NO) predicts that this protein
has a signal peptide.

Variant protein M79217_PEA_1_P11 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 592 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M79217_PEA_1_P11
sequence provides support for the deduced sequence of this variant protein according to the
present invention).
Table 592 - Amino acid mutations

SNP position(s) on amino acid sequence  Alternative amino acid(s)   Previously known SNP?
17                                      P ->                        No
28                                      C -> S                      No
72                                      V ->                        No
90                                      S -> F                      No



Variant protein M79217_PEA_1_P11 is encoded by the following transcript(s):
M79217_PEA_1_T18, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M79217_PEA_1_T18 is shown in bold; this coding portion starts at
position 1354 and ends at position 1674. The transcript also has the following SNPs as listed in
Table 593 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M79217_PEA_1_P11 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 593 - Nucleic acid SNPs

SNP position on nucleotide sequence     Alternative nucleic acid    Previously known SNP?
688                                     C -> T                      No
689                                     T ->                        No
746                                     T -> C                      No
772                                     G -> A                      No
870                                     G -> A                      Yes
1014                                    G -> T                      Yes
1270                                    A -> C                      No
1404                                    G ->                        No
1404                                    G -> C                      No
1422                                    C -> G                      Yes
1435                                    T -> A                      No
1530                                    G -> A                      Yes
1569                                    G ->                        No
1622                                    C -> T                      No




As noted above, cluster M79217 features 32 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster M79217_PEA_1_node_2 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M79217_PEA_1_T3. Table 594 below describes the
starting and ending position of this segment on each transcript.

Table 594 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T3         50                           177



Segment cluster M79217_PEA_1_node_4 according to the present invention is supported
by 8 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M79217_PEA_1_T8, M79217_PEA_1_T15 and
M79217_PEA_1_T18. Table 595 below describes the starting and ending position of this
segment on each transcript.

Table 595 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T8         1                            177
M79217_PEA_1_T15        1                            177
M79217_PEA_1_T18        1                            177



Segment cluster M79217_PEA_1_node_9 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M79217_PEA_1_T1. Table 596 below describes the
starting and ending position of this segment on each transcript.

Table 596 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         1                            597



Segment cluster M79217_PEA_1_node_10 according to the present invention is
supported by 33 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8, M79217_PEA_1_T15 and M79217_PEA_1_T18. Table 597 below
describes the starting and ending position of this segment on each transcript.

Table 597 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         598                          1080
M79217_PEA_1_T3         272                          754
M79217_PEA_1_T8         272                          754
M79217_PEA_1_T15        272                          754
M79217_PEA_1_T18        272                          754
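
Because a segment is the same stretch of nucleic acid sequence wherever it occurs, its length should be identical on every transcript that contains it. The sketch below checks this for the Table 597 coordinates; it assumes 1-based, inclusive positions, and the names `NODE_10` and `segment_lengths` are hypothetical.

```python
# Start and end positions of segment M79217_PEA_1_node_10 on each transcript,
# as listed in Table 597.
NODE_10 = {
    "M79217_PEA_1_T1": (598, 1080),
    "M79217_PEA_1_T3": (272, 754),
    "M79217_PEA_1_T8": (272, 754),
    "M79217_PEA_1_T15": (272, 754),
    "M79217_PEA_1_T18": (272, 754),
}


def segment_lengths(locations):
    """Return the segment length on each transcript (1-based, inclusive)."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


if __name__ == "__main__":
    lengths = segment_lengths(NODE_10)
    print(lengths)
    # A single distinct value means the table is internally consistent.
    print(len(set(lengths.values())) == 1)
```

The same check can be applied to any of the segment location tables; a mismatch in length would point to a transcription error in the table rather than to a biological difference.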




Microarray (chip) data is also available for this segment as follows. As described above 
with regard to the cluster itself, various oligonucleotides were tested for being differentially 
expressed in various disease conditions, particularly cancer. The following oligonucleotides 
were found to hit this segment (in relation to lung cancer), shown in Table 598. 

Table 598 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers     Chip reference
M79217_0_9_0            lung malignant tumors        LUN

Segment cluster M79217_PEA_1_node_11 according to the present invention is
supported by 42 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T15. Table 599 below describes the starting and
ending position of this segment on each transcript.
Table 599 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         1081                         1523
M79217_PEA_1_T3         755                          1197
M79217_PEA_1_T8         755                          1197
M79217_PEA_1_T15        755                          1197



Segment cluster M79217_PEA_1_node_13 according to the present invention is
supported by 35 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T15. Table 600 below describes the starting and
ending position of this segment on each transcript.

Table 600 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         1548                         2075
M79217_PEA_1_T3         1222                         1749
M79217_PEA_1_T8         1222                         1749
M79217_PEA_1_T15        1222                         1749



Segment cluster M79217_PEA_1_node_14 according to the present invention is
supported by 65 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T15. Table 601 below describes the starting and
ending position of this segment on each transcript.

Table 601 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         2076                         3221
M79217_PEA_1_T3         1750                         2895
M79217_PEA_1_T8         1750                         2895
M79217_PEA_1_T15        1750                         2895






Segment cluster M79217_PEA_1_node_16 according to the present invention is
supported by 51 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T15. Table 602 below describes the starting and
ending position of this segment on each transcript.

Table 602 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         3222                         3349
M79217_PEA_1_T3         2896                         3023
M79217_PEA_1_T8         2896                         3023
M79217_PEA_1_T15        2896                         3023



Segment cluster M79217_PEA_1_node_23 according to the present invention is
supported by 50 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8, M79217_PEA_1_T10 and M79217_PEA_1_T15. Table 603 below
describes the starting and ending position of this segment on each transcript.

Table 603 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         3350                         3494
M79217_PEA_1_T3         3024                         3168
M79217_PEA_1_T8         3024                         3168
M79217_PEA_1_T10        157                          301
M79217_PEA_1_T15        3024                         3168
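
A segment shared by several transcripts also gives a way to translate a position on one transcript into the corresponding position on another. The sketch below uses the Table 603 coordinates and assumes 1-based, inclusive positions and an identical, ungapped segment sequence on every transcript; the names `NODE_23` and `convert_position` are hypothetical.

```python
# Start and end positions of segment M79217_PEA_1_node_23 on each transcript,
# as listed in Table 603.
NODE_23 = {
    "M79217_PEA_1_T1": (3350, 3494),
    "M79217_PEA_1_T3": (3024, 3168),
    "M79217_PEA_1_T8": (3024, 3168),
    "M79217_PEA_1_T10": (157, 301),
    "M79217_PEA_1_T15": (3024, 3168),
}


def convert_position(pos, source, target, segment=NODE_23):
    """Translate a position inside the shared segment from one transcript
    to another by preserving its offset from the segment start."""
    src_start, src_end = segment[source]
    if not src_start <= pos <= src_end:
        raise ValueError(f"position {pos} is outside the segment on {source}")
    tgt_start, _ = segment[target]
    return tgt_start + (pos - src_start)


if __name__ == "__main__":
    # SNP positions listed for M79217_PEA_1_T8 that fall inside this segment.
    for pos in (3026, 3047, 3060, 3123):
        print(pos, convert_position(pos, "M79217_PEA_1_T8", "M79217_PEA_1_T10"))
```

Under these assumptions, positions 3026, 3047, 3060 and 3123 on M79217_PEA_1_T8 correspond to positions 159, 180, 193 and 256 on M79217_PEA_1_T10, which are the positions at which the same nucleotide changes appear in Table 588.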



Segment cluster M79217_PEA_1_node_24 according to the present invention is
supported by 2 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T15. Table 604 below
describes the starting and ending position of this segment on each transcript.

Table 604 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T15        3169                         3580



Segment cluster M79217_PEA_1_node_31 according to the present invention is
supported by 50 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 605 below describes the starting and
ending position of this segment on each transcript.

Table 605 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         3716                         3960
M79217_PEA_1_T3         3390                         3634
M79217_PEA_1_T8         3354                         3598
M79217_PEA_1_T10        523                          767



Segment cluster M79217_PEA_1_node_33 according to the present invention is
supported by 71 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 606 below describes the starting and
ending position of this segment on each transcript.

Table 606 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         4015                         4631
M79217_PEA_1_T3         3689                         4305
M79217_PEA_1_T8         3653                         4269
M79217_PEA_1_T10        822                          1438




Segment cluster M79217_PEA_1_node_34 according to the present invention is
supported by 51 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 607 below describes the starting and
ending position of this segment on each transcript.

Table 607 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         4632                         4869
M79217_PEA_1_T3         4306                         4543
M79217_PEA_1_T8         4270                         4507
M79217_PEA_1_T10        1439                         1676



Segment cluster M79217_PEA_1_node_35 according to the present invention is
supported by 53 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 608 below describes the starting and
ending position of this segment on each transcript.

Table 608 - Segment location on transcripts

Transcript name         Segment starting position    Segment ending position
M79217_PEA_1_T1         4870                         4997
M79217_PEA_1_T3         4544                         4671
M79217_PEA_1_T8         4508                         4635
M79217_PEA_1_T10        1677                         1804



Segment cluster M79217_PEA_1_node_37 according to the present invention is
supported by 58 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 609 below describes the starting and
ending position of this segment on each transcript.

Table 609 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      5039                        5280
M79217_PEA_1_T3      4713                        4954
M79217_PEA_1_T8      4677                        4918
M79217_PEA_1_T10     1846                        2087



Segment cluster M79217_PEA_1_node_38 according to the present invention is
supported by 62 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 610 below describes the starting and
ending position of this segment on each transcript.

Table 610 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      5281                        5436
M79217_PEA_1_T3      4955                        5110
M79217_PEA_1_T8      4919                        5074
M79217_PEA_1_T10     2088                        2243

Segment cluster M79217_PEA_1_node_41 according to the present invention is
supported by 171 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M79217_PEA_1_T1,
M79217_PEA_1_T3, M79217_PEA_1_T8, M79217_PEA_1_T10 and M79217_PEA_1_T18.
Table 611 below describes the starting and ending position of this segment on each transcript.

Table 611 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      5628                        6357
M79217_PEA_1_T3      5302                        6031
M79217_PEA_1_T8      5266                        5995
M79217_PEA_1_T10     2435                        3164
M79217_PEA_1_T18     755                         1484



Segment cluster M79217_PEA_1_node_44 according to the present invention is
supported by 89 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8, M79217_PEA_1_T10 and M79217_PEA_1_T18. Table 612 below
describes the starting and ending position of this segment on each transcript.

Table 612 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      6472                        6659
M79217_PEA_1_T3      6146                        6333
M79217_PEA_1_T8      6110                        6297
M79217_PEA_1_T10     3279                        3466
M79217_PEA_1_T18     1599                        1786



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
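The split between the segments described above and the short segments that follow can be expressed as a simple length test. The sketch below assumes 1-based inclusive coordinates and treats "up to about 120 bp" as a hard cutoff of 120; both are assumptions made for illustration, and the function names are hypothetical.

```python
# Threshold taken from the text ("up to about 120 bp"); treating it as an
# exact cutoff is an assumption made for this sketch.
SHORT_SEGMENT_MAX_BP = 120


def segment_length(start: int, end: int) -> int:
    """Length of a segment given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


def is_short_segment(start: int, end: int, max_bp: int = SHORT_SEGMENT_MAX_BP) -> bool:
    """True if the segment would fall under the separate short-segment description."""
    return segment_length(start, end) <= max_bp


if __name__ == "__main__":
    # node_33 on M79217_PEA_1_T1 (Table 606) and node_0 on M79217_PEA_1_T3 (Table 613)
    print(is_short_segment(4015, 4631))
    print(is_short_segment(1, 49))
```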

Segment cluster M79217_PEA_1_node_0 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M79217_PEA_1_T3. Table 613 below describes the
starting and ending position of this segment on each transcript.

Table 613 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T3      1                           49



Segment cluster M79217_PEA_1_node_7 according to the present invention is supported
by 11 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M79217_PEA_1_T3, M79217_PEA_1_T8,
M79217_PEA_1_T15 and M79217_PEA_1_T18. Table 614 below describes the starting and
ending position of this segment on each transcript.

Table 614 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T3      178                         271
M79217_PEA_1_T8      178                         271
M79217_PEA_1_T15     178                         271
M79217_PEA_1_T18     178                         271



Segment cluster M79217_PEA_1_node_12 according to the present invention can be
found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T15. Table 615 below describes the starting and
ending position of this segment on each transcript.

Table 615 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      1524                        1547
M79217_PEA_1_T3      1198                        1221
M79217_PEA_1_T8      1198                        1221
M79217_PEA_1_T15     1198                        1221



Segment cluster M79217_PEA_1_node_19 according to the present invention is
supported by 1 library. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T10. Table 616 below
describes the starting and ending position of this segment on each transcript.

Table 616 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T10     1                           79



Segment cluster M79217_PEA_1_node_21 according to the present invention is
supported by 1 library. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T10. Table 617 below
describes the starting and ending position of this segment on each transcript.

Table 617 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T10     80                          156

Segment cluster M79217_PEA_1_node_26 according to the present invention is
supported by 40 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3
and M79217_PEA_1_T10. Table 618 below describes the starting and ending position of this
segment on each transcript.

Table 618 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      3495                        3530
M79217_PEA_1_T3      3169                        3204
M79217_PEA_1_T10     302                         337



Segment cluster M79217_PEA_1_node_27 according to the present invention is
supported by 46 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 619 below describes the starting and
ending position of this segment on each transcript.

Table 619 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      3531                        3623
M79217_PEA_1_T3      3205                        3297
M79217_PEA_1_T8      3169                        3261
M79217_PEA_1_T10     338                         430

Segment cluster M79217_PEA_1_node_30 according to the present invention is
supported by 47 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 620 below describes the starting and
ending position of this segment on each transcript.

Table 620 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      3624                        3715
M79217_PEA_1_T3      3298                        3389
M79217_PEA_1_T8      3262                        3353
M79217_PEA_1_T10     431                         522



Segment cluster M79217_PEA_1_node_32 according to the present invention is
supported by 40 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 621 below describes the starting and
ending position of this segment on each transcript.

Table 621 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      3961                        4014
M79217_PEA_1_T3      3635                        3688
M79217_PEA_1_T8      3599                        3652
M79217_PEA_1_T10     768                         821

Segment cluster M79217_PEA_1_node_36 according to the present invention is
supported by 42 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 622 below describes the starting and
ending position of this segment on each transcript.

Table 622 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      4998                        5038
M79217_PEA_1_T3      4672                        4712
M79217_PEA_1_T8      4636                        4676
M79217_PEA_1_T10     1805                        1845



Segment cluster M79217_PEA_1_node_39 according to the present invention is
supported by 57 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 623 below describes the starting and
ending position of this segment on each transcript.

Table 623 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      5437                        5520
M79217_PEA_1_T3      5111                        5194
M79217_PEA_1_T8      5075                        5158
M79217_PEA_1_T10     2244                        2327

Segment cluster M79217_PEA_1_node_40 according to the present invention is
supported by 59 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8 and M79217_PEA_1_T10. Table 624 below describes the starting and
ending position of this segment on each transcript.

Table 624 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      5521                        5627
M79217_PEA_1_T3      5195                        5301
M79217_PEA_1_T8      5159                        5265
M79217_PEA_1_T10     2328                        2434



Segment cluster M79217_PEA_1_node_42 according to the present invention is
supported by 99 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8, M79217_PEA_1_T10 and M79217_PEA_1_T18. Table 625 below
describes the starting and ending position of this segment on each transcript.

Table 625 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      6358                        6443
M79217_PEA_1_T3      6032                        6117
M79217_PEA_1_T8      5996                        6081
M79217_PEA_1_T10     3165                        3250
M79217_PEA_1_T18     1485                        1570

Segment cluster M79217_PEA_1_node_43 according to the present invention is
supported by 90 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M79217_PEA_1_T1, M79217_PEA_1_T3,
M79217_PEA_1_T8, M79217_PEA_1_T10 and M79217_PEA_1_T18. Table 626 below
describes the starting and ending position of this segment on each transcript.

Table 626 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
M79217_PEA_1_T1      6444                        6471
M79217_PEA_1_T3      6118                        6145
M79217_PEA_1_T8      6082                        6109
M79217_PEA_1_T10     3251                        3278
M79217_PEA_1_T18     1571                        1598



Variant protein alignment to the previously known protein:

Sequence name: BAA25445

Sequence documentation:

Alignment of: M79217_PEA_1_P1 x BAA25445

Alignment segment 1/1:

Quality: 9101.00
Escore: 0
Matching length: 919                    Total length: 919
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 100.00        Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 13 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 62

 51 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 63 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 112

101 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
113 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 162

151 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
163 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 212

201 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
213 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 262

251 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
263 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 312

301 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
313 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 362

351 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
363 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 412

401 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
413 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 462

451 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
463 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 512

501 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
513 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 562

551 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
563 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 612

601 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
613 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 662

651 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 700
    ||||||||||||||||||||||||||||||||||||||||||||||||||
663 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 712

701 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 750
    ||||||||||||||||||||||||||||||||||||||||||||||||||
713 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 762

751 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 800
    ||||||||||||||||||||||||||||||||||||||||||||||||||
763 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 812

801 GAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIK 850
    ||||||||||||||||||||||||||||||||||||||||||||||||||
813 GAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIK 862

851 VTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVD 900
    ||||||||||||||||||||||||||||||||||||||||||||||||||
863 VTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVD 912

901 SVLFKTRLPHDKTKCFKFI 919
    |||||||||||||||||||
913 SVLFKTRLPHDKTKCFKFI 931



Sequence name: EXL3_HUMAN

Sequence documentation:

Alignment of: M79217_PEA_1_P2 x EXL3_HUMAN

Alignment segment 1/1:

Quality: 8873.00
Escore: 0
Matching length: 907                    Total length: 919
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 98.69         Total Percent Identity: 98.69
Gaps: 1

Alignment:

  1 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 50

 51 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 100

101 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 150

151 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 200

201 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 250

251 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 300

301 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 350

351 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 400

401 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 450

451 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 500

501 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 550

551 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 600

601 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 650

651 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 700
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 700

701 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 750
    ||||||||||||||||||||||||||||||||||||||||||||||||||
701 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 750

751 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 800
    ||||||||||||||||||||||||||||||||||||||||||||||||||
751 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 800

801 GAAFFHK............AIRDMVDEYINCEDIAMNFLVSHITRKPPIK 838
    |||||||            |||||||||||||||||||||||||||||||
801 GAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIK 850

839 VTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVD 888
    ||||||||||||||||||||||||||||||||||||||||||||||||||
851 VTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVD 900

889 SVLFKTRLPHDKTKCFKFI 907
    |||||||||||||||||||
901 SVLFKTRLPHDKTKCFKFI 919
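
The percentages reported for the alignment of M79217_PEA_1_P2 to EXL3_HUMAN can be related to the matching and total lengths with simple arithmetic. The sketch below assumes that the total percent identity is the number of identical aligned positions divided by the total length, rounded to two decimals; that definition is an assumption, not something the text states, although it does reproduce the reported values.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimal places."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 2)


if __name__ == "__main__":
    # M79217_PEA_1_P2 x EXL3_HUMAN: 907 identical positions, total length 919.
    print(percent(907, 919))
    # M79217_PEA_1_P4 x EXL3_HUMAN: 161 of 162 aligned positions identical
    # (a single mismatch at the first aligned position).
    print(percent(161, 162))
```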



Sequence name: EXL3_HUMAN

Sequence documentation:

Alignment of: M79217_PEA_1_P4 x EXL3_HUMAN

Alignment segment 1/1:

Quality: 1668.00
Escore: 0
Matching length: 162                    Total length: 162
Matching Percent Similarity: 100.00     Matching Percent Identity: 99.38
Total Percent Similarity: 100.00        Total Percent Identity: 99.38
Gaps: 0

Alignment:

 51 YRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK 100
    :|||||||||||||||||||||||||||||||||||||||||||||||||
758 FRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHK 807

101 YYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTF 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
808 YYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTF 857

151 RCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTR 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
858 RCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTR 907

201 LPHDKTKCFKFI 212
    ||||||||||||
908 LPHDKTKCFKFI 919
Sequence name: EXL3_HUMAN

Sequence documentation:

Alignment of: M79217_PEA_1_P8 x EXL3_HUMAN

Alignment segment 1/1:

Quality: 7947.00
Escore: 0
Matching length: 807                    Total length: 807
Matching Percent Similarity: 100.00     Matching Percent Identity: 100.00
Total Percent Similarity: 100.00        Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYY 50

 51 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 LTTLDEADEAGKRIFGPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKR 100

101 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 QELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQ 150

151 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYD 200

201 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 SDQFVFGSYLDPLVKQAFQATARANVYVTENADIACLYVILVGEMQEPVV 250

251 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 LRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQ 300

301 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 STFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKI 350

351 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 ESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKAVQDSKLDQVLVEFTC 400

401 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 KNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 450

451 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSL 500

501 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 SDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQIPAAPIREEAA 550

551 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 AEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDF 600

601 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 YRSWNCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEF 650

651 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 700
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 QAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVVVVWNSPK 700

701 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 750
    ||||||||||||||||||||||||||||||||||||||||||||||||||
701 LPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLR 750

751 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 800
    ||||||||||||||||||||||||||||||||||||||||||||||||||
751 HDEIMFGFRVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLT 800

801 GAAFFHK 807
    |||||||
801 GAAFFHK 807

DESCRIPTION FOR CLUSTER M62096
Cluster M62096 features 9 transcript(s) and 42 segment(s) of interest, the names for
which are given in Tables 627 and 628, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 629.

Table 627 - Transcripts of interest

Transcript Name      Sequence ID No.
M62096_PEA_1_T4      65
M62096_PEA_1_T5      66
M62096_PEA_1_T6      67
M62096_PEA_1_T7      68
M62096_PEA_1_T9      69
M62096_PEA_1_T11     70
M62096_PEA_1_T13     71
M62096_PEA_1_T14     72
M62096_PEA_1_T15     73


Table 628 - Segments of interest

Segment Name             Sequence ID No.
M62096_PEA_1_node_0      616
M62096_PEA_1_node_2      617
M62096_PEA_1_node_15     618
M62096_PEA_1_node_17     619
M62096_PEA_1_node_19     620
M62096_PEA_1_node_23     621
M62096_PEA_1_node_27     623
M62096_PEA_1_node_29     624
M62096_PEA_1_node_31     625
M62096_PEA_1_node_34     626
M62096_PEA_1_node_36     627
M62096_PEA_1_node_38     628
M62096_PEA_1_node_40     629
M62096_PEA_1_node_48     630
M62096_PEA_1_node_50     631
M62096_PEA_1_node_56     632
M62096_PEA_1_node_60     633
M62096_PEA_1_node_65     634
M62096_PEA_1_node_69     635
M62096_PEA_1_node_71     636
M62096_PEA_1_node_1      637
M62096_PEA_1_node_4      638
M62096_PEA_1_node_6      639
M62096_PEA_1_node_7      640
M62096_PEA_1_node_9      641
M62096_PEA_1_node_11     642
M62096_PEA_1_node_13     643
M62096_PEA_1_node_21     644
M62096_PEA_1_node_25     645
M62096_PEA_1_node_33     646
M62096_PEA_1_node_42     647
M62096_PEA_1_node_44     648
M62096_PEA_1_node_47     649
M62096_PEA_1_node_51     650
M62096_PEA_1_node_53     651
M62096_PEA_1_node_55     652
M62096_PEA_1_node_58     653
M62096_PEA_1_node_62     654
M62096_PEA_1_node_66     655
M62096_PEA_1_node_67     656
M62096_PEA_1_node_68     657
M62096_PEA_1_node_70     658



Table 629 - Proteins of interest

Protein Name         Sequence ID No.   Corresponding transcript(s)
M62096_PEA_1_P4      1341              M62096_PEA_1_T6
M62096_PEA_1_P5      1342              M62096_PEA_1_T7
M62096_PEA_1_P3      1343              M62096_PEA_1_T9
M62096_PEA_1_P7      1344              M62096_PEA_1_T11
M62096_PEA_1_P8      1345              M62096_PEA_1_T13
M62096_PEA_1_P9      1346              M62096_PEA_1_T14
M62096_PEA_1_P10     1347              M62096_PEA_1_T15
M62096_PEA_1_P11     1348              M62096_PEA_1_T4
M62096_PEA_1_P12     1349              M62096_PEA_1_T5



These sequences are variants of the known protein Kinesin heavy chain isoform 5C
(SwissProt accession identifier KF5C_HUMAN; known also according to the synonyms
Kinesin heavy chain neuron-specific 2), SEQ ID NO: 1438, referred to herein as the previously
known protein.

Protein Kinesin heavy chain isoform 5C is known or believed to have the following
function(s): Kinesin is a microtubule-associated force-producing protein that may play a role in
organelle transport. The sequence for protein Kinesin heavy chain isoform 5C is given at the end
of the application, as "Kinesin heavy chain isoform 5C amino acid sequence". Known
polymorphisms for this sequence are as shown in Table 630.



Table 630 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence   Comment
355 - 360                                TLKNVI -> STHASV
583 - 585                                EFT -> DRV



WO 2006/131783 



PCT/IB2005/004037 






The following GO Annotation(s) apply to the previously known protein. The following 
annotation(s) were found: organelle organization and biogenesis, which are annotation(s) related 
to Biological Process; microtubule motor; ATP binding, which are annotation(s) related to 
Molecular Function; and kinesin, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl 
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available 
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>. 

As noted above, cluster M62096 features 9 transcript(s), which were listed in Table 1 
above. These transcript(s) encode for protein(s) which are variant(s) of protein Kinesin heavy
chain isoform 5C. A description of each variant protein according to the present invention is 
now provided. 

Variant protein M62096_PEA_1_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T6. An alignment is given to the known protein (Kinesin heavy chain isoform
5C) at the end of the application. One or more alignments to one or more previously published
protein sequences are given at the end of the application. A brief description of the relationship
of the variant protein according to the present invention to each such aligned protein is as
follows:

Comparison report between M62096_PEA_1_P4 and KF5C_HUMAN:
1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P4, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MATYIH corresponding to amino acids 1 - 6 of M62096_PEA_1_P4, and a
second amino acid sequence being at least 90% homologous to

VSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLG
RTTIVICCSPSVFNEAETKSTLM^
LKNVIQHLEMELNRWRNGEAVPEDEQISA
KYDEEISSLYRQLDDKDDEINQQSQ

IENEAAKDEVKEVLQALEELAVNYDQ




LSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYIS
KMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQN
MEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQ
MESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREM
KLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDD
GGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALES
ALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAK^
VHAIRGGGGSSSNSTHYQK corresponding to amino acids 239 - 957 of KF5C_HUMAN,
which also corresponds to amino acids 7 - 725 of M62096_PEA_1_P4, wherein said first amino
acid sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of M62096_PEA_1_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MATYIH of M62096_PEA_1_P4.
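The coordinate statements in the comparison report above can be sanity-checked mechanically. The following is a minimal sketch, not part of the application: the function name `check_chimera` and the tuple layout are illustrative assumptions. It verifies that the stated segments tile the variant protein without gaps, and that each segment mapped onto the known protein spans the same number of residues in both numbering schemes.

```python
def check_chimera(segments):
    """Check a list of (variant_range, known_range) segment descriptions.

    Each range is an inclusive, 1-based (start, end) tuple; known_range is
    None for a segment that has no counterpart in the known protein.
    Returns True when the segments are contiguous on the variant, start at
    residue 1, and every mapped segment has equal length on both proteins.
    """
    expected_start = 1
    for variant_range, known_range in segments:
        v_start, v_end = variant_range
        if v_start != expected_start or v_end < v_start:
            return False
        if known_range is not None:
            k_start, k_end = known_range
            if k_end - k_start != v_end - v_start:
                return False
        expected_start = v_end + 1
    return True


# Coordinates as stated for M62096_PEA_1_P4 versus KF5C_HUMAN.
p4_segments = [
    ((1, 6), None),          # unique head, MATYIH
    ((7, 725), (239, 957)),  # portion shared with the known protein
]

print(check_chimera(p4_segments))  # True
```

The same check can be applied to the other comparison reports by substituting their stated ranges.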


The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.
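The reasoning in the paragraph above amounts to a small decision rule over the yes/no calls of the prediction programs. The sketch below is one hedged reading of that rule, not the actual software used in the application. The function name and the boolean inputs are assumptions; the "secreted" branch follows the wording used elsewhere in this application for variants where one of the two signal-peptide programs predicts a signal peptide; and the final "membrane" branch is an assumption for a case the text does not describe.

```python
def predict_localization(tm_region_calls, signal_peptide_calls):
    """Combine yes/no calls from prediction programs into one label.

    tm_region_calls: one bool per trans-membrane region predictor
        (True means a trans-membrane region was predicted).
    signal_peptide_calls: one bool per signal-peptide predictor
        (True means a signal peptide was predicted).
    """
    if any(signal_peptide_calls):
        # A single positive signal-peptide call is treated as sufficient.
        return "secreted"
    if not any(tm_region_calls):
        # No trans-membrane region and no signal peptide predicted.
        return "intracellular"
    # Assumption: this case is not covered by the text above.
    return "membrane"


print(predict_localization([False, False], [False, False]))  # intracellular
```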

Variant protein M62096_PEA_1_P4 is encoded by the following transcript(s):
M62096_PEA_1_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T6 is shown in bold; this coding portion starts at
position 108 and ends at position 2282. The transcript also has the following SNPs as listed in
Table 631 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P4 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 631 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
5757                                   G->T                        No
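The stated coding portion can be cross-checked against the stated protein length. The sketch below (the function name `codon_count` is an illustrative assumption) counts the whole codons in an inclusive nucleotide range. For transcript M62096_PEA_1_T6, positions 108 to 2282 span 2175 nucleotides, that is 725 codons, which matches the 725 amino acids given for M62096_PEA_1_P4 in the comparison report; this suggests that the stated range does not include a stop codon.

```python
def codon_count(coding_start, coding_end):
    """Number of whole codons in an inclusive, 1-based coding portion."""
    length = coding_end - coding_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(
            f"coding portion of {length} nt is not a whole number of codons"
        )
    return length // 3


# Coding portion of M62096_PEA_1_T6 as stated in the text.
codons = codon_count(108, 2282)
print(codons)  # 725
```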



Variant protein M62096_PEA_1_P5 according to the present invention has an amino acid 
sequence as given at the end of the application; it is encoded by transcript(s) 
M62096_PEA_1_T7. An alignment is given to the known protein (Kinesin heavy chain isoform 
5C) at the end of the application. One or more alignments to one or more previously published 
protein sequences are given at the end of the application. A brief description of the relationship 
of the variant protein according to the present invention to each such aligned protein is as 
follows: 

Comparison report between M62096_PEA_1_P5 and KF5C_HUMAN: 
1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P5, comprising a first
amino acid sequence being at least 90% homologous to

MTRILQDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWK
KKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNI
APVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRR
DYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDEL
AQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNG
VIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHE
AKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQ
DAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDY
NKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTT
RVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRL
RATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPI
RPGHYPASSPTAVHAIRGGGGSSSNSTHYQK corresponding to amino acids 284 - 957 of
KF5C_HUMAN, which also corresponds to amino acids 1 - 674 of M62096_PEA_1_P5.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P5 is encoded by the following transcript(s):
M62096_PEA_1_T7, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T7 is shown in bold; this coding portion starts at
position 283 and ends at position 2304. The transcript also has the following SNPs as listed in
Table 632 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 632 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
5779                                   G->T                        No



Variant protein M62096_PEA_1_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T9. An alignment is given to the known protein (Kinesin heavy chain isoform
5C) at the end of the application. One or more alignments to one or more previously published
protein sequences are given at the end of the application. A brief description of the relationship
of the variant protein according to the present invention to each such aligned protein is as
follows:

Comparison report between M62096_PEA_1_P3 and KF5C_HUMAN:
1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P3, comprising a first
amino acid sequence being at least 90% homologous to

MELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSL
YRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKD
EVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQELS
NHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKS
LVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQL
EESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAH
QKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREM
DKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQK
QKISFLENNLEQLTKVHKQLVRDNADLRCEL^
AMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTAVHAIR
SSSNSTHYQK corresponding to amino acids 365 - 957 of KF5C_HUMAN, which also
corresponds to amino acids 1 - 593 of M62096_PEA_1_P3.

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P3 is encoded by the following transcript(s):
M62096_PEA_1_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T9 is shown in bold; this coding portion starts at
position 565 and ends at position 2343. The transcript also has the following SNPs as listed in
Table 633 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P3 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 633 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
5818                                   G->T                        No



Variant protein M62096_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T11. An alignment is given to the known protein (Kinesin heavy chain
isoform 5C) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M62096_PEA_1_P7 and KF5C_HUMAN:

1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P7, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MTQNFRLMWNILLFPLNFS corresponding to amino acids 1 - 19 of
M62096_PEA_1_P7, and a second amino acid sequence being at least 90% homologous to
LNQKLQLEQEKLSSDYNK^
QTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVR
DNADLRCELPKLEKRLRATAER
KNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK corresponding to
amino acids 738 - 957 of KF5C_HUMAN, which also corresponds to amino acids 20 - 239 of
M62096_PEA_1_P7, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of M62096_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MTQNFRLMWNILLFPLNFS of M62096_PEA_1_P7.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because one of the two
signal-peptide prediction programs (HMM: Non-secretory protein, NN: YES) predicts that this
protein has a signal peptide.

Variant protein M62096_PEA_1_P7 is encoded by the following transcript(s):
M62096_PEA_1_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T11 is shown in bold; this coding portion starts at
position 633 and ends at position 1349. The transcript also has the following SNPs as listed in
Table 634 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 634 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
4824                                   G->T                        No



Variant protein M62096_PEA_1_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T13. An alignment is given to the known protein (Kinesin heavy chain
isoform 5C) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M62096_PEA_1_P8 and KF5C_HUMAN:
1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P8, comprising a first
amino acid sequence being at least 90% homologous to

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN
CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNK
TLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK
EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRL
QIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQR
ELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARL
SKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQN
MEQKRRQLEESQDSLSEELAKLRAQEKMHEVSF^^
MESHREAHQKQLSRLRDEIEEKQKIIDEIR corresponding to amino acids 1 - 736 of
KF5C_HUMAN, which also corresponds to amino acids 1 - 736 of M62096_PEA_1_P8, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence E corresponding to amino acids 737 - 737 of M62096_PEA_1_P8, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P8 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 635 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M62096_PEA_1_P8
sequence provides support for the deduced sequence of this variant protein according to the
present invention).



Table 635 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
5                                         A->T                         Yes



Variant protein M62096_PEA_1_P8 is encoded by the following transcript(s):
M62096_PEA_1_T13, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T13 is shown in bold; this coding portion starts at
position 396 and ends at position 2606. The transcript also has the following SNPs as listed in
Table 636 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P8 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 636 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
92                                     C->A                        Yes
408                                    G->A                        Yes
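The nucleotide SNP positions in Table 636 can be related to the amino acid SNP positions in Table 635 through the stated coding start. The following is a minimal sketch; the function name `codon_of_nucleotide` is an illustrative assumption and the application does not describe how the mapping was computed. With the coding portion of M62096_PEA_1_T13 running from position 396 to 2606, nucleotide 408 falls on the first base of codon 5, consistent with the A->T change listed at amino acid position 5, while nucleotide 92 lies upstream of the coding portion.

```python
def codon_of_nucleotide(position, coding_start, coding_end):
    """Map a 1-based transcript position to (codon number, base within codon).

    Returns None when the position lies outside the inclusive coding portion.
    Both returned values are 1-based.
    """
    if position < coding_start or position > coding_end:
        return None
    offset = position - coding_start
    return offset // 3 + 1, offset % 3 + 1


# Values as stated for transcript M62096_PEA_1_T13.
print(codon_of_nucleotide(408, 396, 2606))  # (5, 1)
print(codon_of_nucleotide(92, 396, 2606))   # None
```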



Variant protein M62096_PEA_1_P9 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T14. An alignment is given to the known protein (Kinesin heavy chain
isoform 5C) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M62096_PEA_1_P9 and KF5C_HUMAN:




1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P9, comprising a first 
amino acid sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN
CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNW^
TLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK
EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDE corresponding to amino acids
1 - 454 of KF5C_HUMAN, which also corresponds to amino acids 1 - 454 of
M62096_PEA_1_P9, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence
VKNAIYFFFHKV^
corresponding to amino acids 455 - 514 of M62096_PEA_1_P9, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of M62096_PEA_1_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
VKNAIYFFFHKVLLLLFVVDVCSRNLIGIEAFHNYRIMWKFLGRCPFTAS
in M62096_PEA_1_P9.

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P9 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 637 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M62096_PEA_1_P9
sequence provides support for the deduced sequence of this variant protein according to the
present invention).
Table 637 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
5                                         A->T                         Yes



Variant protein M62096_PEA_1_P9 is encoded by the following transcript(s):
M62096_PEA_1_T14, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T14 is shown in bold; this coding portion starts at
position 396 and ends at position 1937. The transcript also has the following SNPs as listed in
Table 638 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P9 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 638 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
92                                     C->A                        Yes
408                                    G->A                        Yes



Variant protein M62096_PEA_1_P10 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M62096_PEA_1_T15. An alignment is given to the known protein (Kinesin heavy chain
isoform 5C) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M62096_PEA_1_P10 and KF5C_HUMAN:

1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P10, comprising a first 
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more 
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having 
the sequence MTQNFRLMWNILLFPLNFS corresponding to amino acids 1 - 19 of 
M62096_PEA_1_P10, a second amino acid sequence being at least 90 % homologous to 
LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL 

QTLHNLRKLFVQDLTTRVKK corresponding to amino acids 738 - 815 of KF5C_HUMAN, 
which also corresponds to amino acids 20 - 97 of M62096_PEA_1_P10, and a third amino acid 
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 
VSSLCLNGTEKKIKDGREESFSVEISLA corresponding to amino acids 98 - 125 of 
M62096_PEA_1_P10, wherein said first amino acid sequence, second amino acid sequence and 
third amino acid sequence are contiguous and in a sequential order. 

2. An isolated polypeptide encoding for a head of M62096_PEA_1_P10, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence MTQNFRLMWNILLFPLNFS of M62096_PEA_1_P10. 

3. An isolated polypeptide encoding for a tail of M62096_PEA_1_P10, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence VSSLCLNGTEKKIKDGREESFSVEISLA in M62096_PEA_1_P10. 
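The percentage thresholds recited above (70%, 80%, 85%, 90%, 95%) can be illustrated with a simple calculation. The sketch below uses percent identity over an ungapped, equal-length comparison as a stand-in for "homologous"; that substitution is an assumption made for illustration only, since the application may define homology differently (for example through an alignment program with its own scoring). The candidate sequence is hypothetical and differs from the recited tail at one position.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Tail of M62096_PEA_1_P10 as recited in the text (28 residues).
tail = "VSSLCLNGTEKKIKDGREESFSVEISLA"
# Hypothetical candidate with a single substitution at the last residue.
candidate = "VSSLCLNGTEKKIKDGREESFSVEISLG"

identity = percent_identity(tail, candidate)
for threshold in (70, 80, 85, 90, 95):
    print(threshold, identity >= threshold)
```

With one mismatch in 28 residues, the candidate is 27/28 identical and so clears every recited threshold under this simplified measure.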

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because one of the two
signal-peptide prediction programs (HMM: Non-secretory protein, NN: YES) predicts that this
protein has a signal peptide.

Variant protein M62096_PEA_1_P10 is encoded by the following transcript(s): 
M62096_PEA_1_T15, for which the sequence(s) is/are given at the end of the application. The 
coding portion of transcript M62096_PEA_1_T15 is shown in bold; this coding portion starts at 
position 633 and ends at position 1007. 

Variant protein M62096_PEA_1_P11 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s) 
M62096_PEA_1_T4. An alignment is given to the known protein (Kinesin heavy chain isoform 
5C) at the end of the application. One or more alignments to one or more previously published 
protein sequences are given at the end of the application. A brief description of the relationship 
of the variant protein according to the present invention to each such aligned protein is as 
follows: 

Comparison report between M62096_PEA_1_P11 and KF5C_HUMAN:
1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P11, comprising a first
amino acid sequence being at least 90 % homologous to 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN
CRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNK
TLKNVIQHLEMELNRWRN corresponding to amino acids 1 - 372 of KF5C_HUMAN, which
also corresponds to amino acids 1 - 372 of M62096_PEA_1_P11, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
DFLAAHVFGKLLE corresponding to amino acids 373 - 385 of M62096_PEA_1_P11, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of M62096_PEA_1_P11, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DFLAAHVFGKLLE in M62096_PEA_1_P11.


The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P11 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 639 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M62096_PEA_1_P11
sequence provides support for the deduced sequence of this variant protein according to the
present invention).
Table 639 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
5                                         A->T                         Yes






Variant protein M62096_PEA_1_P11 is encoded by the following transcript(s):
M62096_PEA_1_T4, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T4 is shown in bold; this coding portion starts at
position 396 and ends at position 1550. The transcript also has the following SNPs as listed in
Table 640 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P11 sequence provides support for the
deduced sequence of this variant protein according to the present invention).



Table 640 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
92                                     C->A                        Yes
408                                    G->A                        Yes
6908                                   G->T                        No




Variant protein M62096_PEA_1_P12 according to the present invention has an amino 
acid sequence as given at the end of the application; it is encoded by transcript(s) 
M62096_PEA_1_T5. An alignment is given to the known protein (Kinesin heavy chain isoform 
5C) at the end of the application. One or more alignments to one or more previously published 
protein sequences are given at the end of the application. A brief description of the relationship
of the variant protein according to the present invention to each such aligned protein is as 
follows: 

Comparison report between M62096_PEA_1_P12 and KF5C_HUMAN:

1. An isolated chimeric polypeptide encoding for M62096_PEA_1_P12, comprising a first
amino acid sequence being at least 90% homologous to

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQ
EQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFD
HIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLAVHEDKNRVPYVKGCTERFVSSPE
EVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSE
KVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGN
CRTTIVICCSPSVFNEAETKSTLMFGQR corresponding to amino acids 1 - 323 of
KF5C_HUMAN, which also corresponds to amino acids 1 - 323 of M62096_PEA_1_P12, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence V corresponding to amino acids 324 - 324 of M62096_PEA_1_P12,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein M62096_PEA_1_P12 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 641 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M62096_PEA_1_P12
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 641 - Amino acid mutations 



'sequence - J', . 


Alternative amino add(s) j ; 


Previously known SNP? 


5 


A->T 


Yes 
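
As an illustrative sketch only (it is not part of the original disclosure), the entry in Table 641 can be applied to the start of the amino acid sequence listed above. The helper name `apply_substitution` is hypothetical, and only the first ten residues of the listed sequence (MADPAECSIK) are used.

```python
def apply_substitution(seq: str, position: int, ref: str, alt: str) -> str:
    """Return seq with the 1-based residue at `position` changed from ref to alt."""
    if not 1 <= position <= len(seq):
        raise ValueError("position lies outside the sequence")
    found = seq[position - 1]
    if found != ref:
        # Guard against applying a table entry to the wrong sequence or offset.
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return seq[:position - 1] + alt + seq[position:]


# First ten residues of the sequence listed above (assumed to start at residue 1).
n_terminus = "MADPAECSIK"

# Table 641: position 5, A -> T.
variant = apply_substitution(n_terminus, 5, "A", "T")
print(variant)
```

The reference-residue check is a design choice: it fails loudly if the position in the table does not line up with the sequence being edited.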



Variant protein M62096_PEA_1_P12 is encoded by the following transcript(s):
M62096_PEA_1_T5, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M62096_PEA_1_T5 is shown in bold; this coding portion starts at
position 378 and ends at position 1349. The transcript also has the following SNPs as listed in
Table 642 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M62096_PEA_1_P12 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 642 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
92                                     C -> A                      Yes
390                                    G -> A                      Yes
6784                                   G -> T                      No
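
The coding coordinates stated above (positions 378 to 1349 on M62096_PEA_1_T5) allow a quick consistency check against the SNP positions in Table 642. The following is a minimal sketch under the assumption that positions are 1-based and inclusive and that the reading frame begins at position 378; the function names are hypothetical.

```python
CDS_START = 378   # first coding position on M62096_PEA_1_T5 (from the text)
CDS_END = 1349    # last coding position on M62096_PEA_1_T5 (from the text)


def cds_length(start: int = CDS_START, end: int = CDS_END) -> int:
    """Length in nucleotides of a 1-based, inclusive coding range."""
    return end - start + 1


def locate_snp(position: int, start: int = CDS_START, end: int = CDS_END):
    """Return (codon_number, base_in_codon), both 1-based.

    Returns None when the position lies outside the coding portion.
    """
    if position < start or position > end:
        return None
    offset = position - start
    return offset // 3 + 1, offset % 3 + 1


print(cds_length())        # nucleotides in the coding portion
print(cds_length() // 3)   # whole codons in the coding portion
for snp_position in (92, 390, 6784):   # positions from Table 642
    print(snp_position, locate_snp(snp_position))
```

Under these assumptions the coding portion spans 972 nucleotides, that is 324 codons, which matches the 324 amino acids described for M62096_PEA_1_P12. Position 390 falls on the first base of codon 5, and a G -> A change at the first base of an alanine codon (GCN) yields a threonine codon (ACN) in the standard genetic code, which is consistent with the A -> T change at amino acid position 5 in Table 641. Positions 92 and 6784 fall outside the stated coding portion.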



As noted above, cluster M62096 features 42 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster M62096_PEA_1_node_0 according to the present invention is supported
by 14 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 643 below describes the starting and
ending position of this segment on each transcript.

Table 643 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1                            355
M62096_PEA_1_T5     1                            355
M62096_PEA_1_T13    1                            355
M62096_PEA_1_T14    1                            355



Segment cluster M62096_PEA_1_node_2 according to the present invention is supported
by 12 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 644 below describes the starting and
ending position of this segment on each transcript.

Table 644 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     374                          521
M62096_PEA_1_T5     356                          503
M62096_PEA_1_T13    374                          521
M62096_PEA_1_T14    374                          521
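
Because each segment table lists 1-based, inclusive start and end positions, the segment length on every transcript can be computed and compared. The sketch below does this for the values in Tables 643 and 644; the dictionary layout and function names are assumptions made for illustration, and transcript names are abbreviated to their suffixes.

```python
# (start, end) positions copied from Tables 643 and 644.
SEGMENTS = {
    "node_0": {"T4": (1, 355), "T5": (1, 355), "T13": (1, 355), "T14": (1, 355)},
    "node_2": {"T4": (374, 521), "T5": (356, 503), "T13": (374, 521), "T14": (374, 521)},
}


def segment_lengths(locations):
    """Length of the segment on each transcript (inclusive coordinates)."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


def consistent_length(locations):
    """Return the single length shared by all transcripts, or raise if they differ."""
    lengths = set(segment_lengths(locations).values())
    if len(lengths) != 1:
        raise ValueError(f"inconsistent segment lengths: {sorted(lengths)}")
    return lengths.pop()


for node, locations in SEGMENTS.items():
    print(node, consistent_length(locations))
```

A check of this kind is a convenient way to catch transcription errors in the tables: the same segment should have the same length on every transcript even when its starting position shifts.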



Segment cluster M62096_PEA_1_node_15 according to the present invention is
supported by 28 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 645 below describes the starting and
ending position of this segment on each transcript.

Table 645 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     985                          1109
M62096_PEA_1_T5     967                          1091
M62096_PEA_1_T13    985                          1109
M62096_PEA_1_T14    985                          1109

Segment cluster M62096_PEA_1_node_17 according to the present invention is
supported by 1 library. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T7. Table 646 below
describes the starting and ending position of this segment on each transcript.

Table 646 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T7     1                            147



Segment cluster M62096_PEA_1_node_19 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T6 and
M62096_PEA_1_T9. Table 647 below describes the starting and ending position of this
segment on each transcript.

Table 647 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T6     1                            125
M62096_PEA_1_T9     1                            125

Segment cluster M62096_PEA_1_node_23 according to the present invention is
supported by 36 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 648 below describes the starting and ending position of this
segment on each transcript.

Table 648 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1215                         1363
M62096_PEA_1_T5     1197                         1345
M62096_PEA_1_T6     231                          379
M62096_PEA_1_T7     253                          401
M62096_PEA_1_T9     231                          379
M62096_PEA_1_T13    1215                         1363
M62096_PEA_1_T14    1215                         1363



Segment cluster M62096_PEA_1_node_27 according to the present invention is
supported by 35 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 649 below describes the starting and ending position of this
segment on each transcript.

Table 649 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1364                         1512
M62096_PEA_1_T5     1407                         1555
M62096_PEA_1_T6     380                          528
M62096_PEA_1_T7     402                          550
M62096_PEA_1_T9     441                          589
M62096_PEA_1_T13    1364                         1512
M62096_PEA_1_T14    1364                         1512

Segment cluster M62096_PEA_1_node_29 according to the present invention is
supported by 1 library. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4. Table 650 below
describes the starting and ending position of this segment on each transcript.

Table 650 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1513                         1679



Segment cluster M62096_PEA_1_node_31 according to the present invention is
supported by 24 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 651 below describes the starting and ending position of this
segment on each transcript.

Table 651 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1680                         1855
M62096_PEA_1_T5     1556                         1731
M62096_PEA_1_T6     529                          704
M62096_PEA_1_T7     551                          726
M62096_PEA_1_T9     590                          765
M62096_PEA_1_T13    1513                         1688
M62096_PEA_1_T14    1513                         1688



Segment cluster M62096_PEA_1_node_34 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T14. Table 652 below
describes the starting and ending position of this segment on each transcript.

Table 652 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T14    1758                         2261



Segment cluster M62096_PEA_1_node_36 according to the present invention is
supported by 26 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 653 below describes the starting and ending position of this segment on each transcript.

Table 653 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1925                         2131
M62096_PEA_1_T5     1801                         2007
M62096_PEA_1_T6     774                          980
M62096_PEA_1_T7     796                          1002
M62096_PEA_1_T9     835                          1041
M62096_PEA_1_T13    1758                         1964

Segment cluster M62096_PEA_1_node_38 according to the present invention is
supported by 24 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 654 below describes the starting and ending position of this segment on each transcript.

Table 654 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2132                         2278
M62096_PEA_1_T5     2008                         2154
M62096_PEA_1_T6     981                          1127
M62096_PEA_1_T7     1003                         1149
M62096_PEA_1_T9     1042                         1188
M62096_PEA_1_T13    1965                         2111



Segment cluster M62096_PEA_1_node_40 according to the present invention is
supported by 21 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 655 below describes the starting and ending position of this segment on each transcript.

Table 655 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2279                         2467
M62096_PEA_1_T5     2155                         2343
M62096_PEA_1_T6     1128                         1316
M62096_PEA_1_T7     1150                         1338
M62096_PEA_1_T9     1189                         1377
M62096_PEA_1_T13    2112                         2300



Segment cluster M62096_PEA_1_node_48 according to the present invention is
supported by 7 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T13. Table 656 below
describes the starting and ending position of this segment on each transcript.

Table 656 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T13    2606                         2945



Segment cluster M62096_PEA_1_node_50 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T11 and
M62096_PEA_1_T15. Table 657 below describes the starting and ending position of this
segment on each transcript.

Table 657 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T11    1                            688
M62096_PEA_1_T15    1                            688



Segment cluster M62096_PEA_1_node_56 according to the present invention is
supported by 1 library. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T15. Table 658 below
describes the starting and ending position of this segment on each transcript.

Table 658 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T15    924                          1059



Segment cluster M62096_PEA_1_node_60 according to the present invention is
supported by 13 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 659 below describes the starting and ending position of this segment on each transcript.

Table 659 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     3113                         3329
M62096_PEA_1_T5     2989                         3205
M62096_PEA_1_T6     1962                         2178
M62096_PEA_1_T7     1984                         2200
M62096_PEA_1_T9     2023                         2239
M62096_PEA_1_T11    1029                         1245



Segment cluster M62096_PEA_1_node_65 according to the present invention is
supported by 51 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 660 below describes the starting and ending position of this segment on each transcript.

Table 660 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     3444                         4763
M62096_PEA_1_T5     3320                         4639
M62096_PEA_1_T6     2293                         3612
M62096_PEA_1_T7     2315                         3634
M62096_PEA_1_T9     2354                         3673
M62096_PEA_1_T11    1360                         2679

Segment cluster M62096_PEA_1_node_69 according to the present invention is
supported by 85 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 661 below describes the starting and ending position of this segment on each transcript.

Table 661 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     4894                         5826
M62096_PEA_1_T5     4770                         5702
M62096_PEA_1_T6     3743                         4675
M62096_PEA_1_T7     3765                         4697
M62096_PEA_1_T9     3804                         4736
M62096_PEA_1_T11    2810                         3742



Segment cluster M62096_PEA_1_node_71 according to the present invention is
supported by 178 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M62096_PEA_1_T4,
M62096_PEA_1_T5, M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and
M62096_PEA_1_T11. Table 662 below describes the starting and ending position of this
segment on each transcript.

Table 662 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     5882                         7128
M62096_PEA_1_T5     5758                         7004
M62096_PEA_1_T6     4731                         5977
M62096_PEA_1_T7     4753                         5999
M62096_PEA_1_T9     4792                         6038
M62096_PEA_1_T11    3798                         5044



the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 

Segment cluster M62096_PEA_1_node_1 according to the present invention can be found
in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 663 below describes the starting and ending position of this
segment on each transcript.

Table 663 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     356                          373
M62096_PEA_1_T13    356                          373
M62096_PEA_1_T14    356                          373
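
The positions in Tables 643, 644 and 663 are consistent with segments being laid end to end along each transcript: node_1 is 18 nucleotides long and is absent from M62096_PEA_1_T5, which would account for node_2 starting 18 positions earlier on that transcript. The sketch below reproduces those coordinates under that contiguity assumption; the segment lengths are derived from the tables, and the `layout` helper is hypothetical.

```python
from itertools import accumulate

# Segment lengths derived from Tables 643, 663 and 644 (end - start + 1).
SEGMENT_LENGTHS = {"node_0": 355, "node_1": 18, "node_2": 148}


def layout(nodes):
    """Place segments end to end; return 1-based inclusive (start, end) per node."""
    ends = list(accumulate(SEGMENT_LENGTHS[node] for node in nodes))
    starts = [1] + [end + 1 for end in ends[:-1]]
    return dict(zip(nodes, zip(starts, ends)))


# Assumed segment order at the 5' end of two transcripts.
t4 = layout(["node_0", "node_1", "node_2"])   # M62096_PEA_1_T4 contains node_1
t5 = layout(["node_0", "node_2"])             # M62096_PEA_1_T5 lacks node_1

print(t4)
print(t5)
```

Under this assumption node_2 is placed at 374 to 521 on T4 and at 356 to 503 on T5, in agreement with Table 644.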






Segment cluster M62096_PEA_1_node_4 according to the present invention is supported
by 12 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 664 below describes the starting and
ending position of this segment on each transcript.

Table 664 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     522                          612
M62096_PEA_1_T5     504                          594
M62096_PEA_1_T13    522                          612
M62096_PEA_1_T14    522                          612



Segment cluster M62096_PEA_1_node_6 according to the present invention is supported
by 13 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 665 below describes the starting and
ending position of this segment on each transcript.

Table 665 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     613                          686
M62096_PEA_1_T5     595                          668
M62096_PEA_1_T13    613                          686
M62096_PEA_1_T14    613                          686



Segment cluster M62096_PEA_1_node_7 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 666 below describes the starting and
ending position of this segment on each transcript.

Table 666 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     687                          791
M62096_PEA_1_T5     669                          773
M62096_PEA_1_T13    687                          791
M62096_PEA_1_T14    687                          791









Segment cluster M62096_PEA_1_node_9 according to the present invention is supported
by 18 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 667 below describes the starting and
ending position of this segment on each transcript.

Table 667 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     792                          840
M62096_PEA_1_T5     774                          822
M62096_PEA_1_T13    792                          840
M62096_PEA_1_T14    792                          840



Segment cluster M62096_PEA_1_node_11 according to the present invention is
supported by 22 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 668 below describes the starting and
ending position of this segment on each transcript.

Table 668 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     841                          896
M62096_PEA_1_T5     823                          878
M62096_PEA_1_T13    841                          896
M62096_PEA_1_T14    841                          896

Segment cluster M62096_PEA_1_node_13 according to the present invention is
supported by 24 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T13 and M62096_PEA_1_T14. Table 669 below describes the starting and
ending position of this segment on each transcript.

Table 669 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     897                          984
M62096_PEA_1_T5     879                          966
M62096_PEA_1_T13    897                          984
M62096_PEA_1_T14    897                          984



Segment cluster M62096_PEA_1_node_21 according to the present invention is
supported by 33 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 670 below describes the starting and ending position of this
segment on each transcript.

Table 670 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1110                         1214
M62096_PEA_1_T5     1092                         1196
M62096_PEA_1_T6     126                          230
M62096_PEA_1_T7     148                          252
M62096_PEA_1_T9     126                          230
M62096_PEA_1_T13    1110                         1214
M62096_PEA_1_T14    1110                         1214



Segment cluster M62096_PEA_1_node_25 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T5 and
M62096_PEA_1_T9. Table 671 below describes the starting and ending position of this
segment on each transcript.

Table 671 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T5     1346                         1406
M62096_PEA_1_T9     380                          440



Segment cluster M62096_PEA_1_node_33 according to the present invention is
supported by 20 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T13 and
M62096_PEA_1_T14. Table 672 below describes the starting and ending position of this
segment on each transcript.

Table 672 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     1856                         1924
M62096_PEA_1_T5     1732                         1800
M62096_PEA_1_T6     705                          773
M62096_PEA_1_T7     727                          795
M62096_PEA_1_T9     766                          834
M62096_PEA_1_T13    1689                         1757
M62096_PEA_1_T14    1689                         1757



Segment cluster M62096_PEA_1_node_42 according to the present invention is
supported by 17 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 673 below describes the starting and ending position of this segment on each transcript.

Table 673 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2468                         2585
M62096_PEA_1_T5     2344                         2461
M62096_PEA_1_T6     1317                         1434
M62096_PEA_1_T7     1339                         1456
M62096_PEA_1_T9     1378                         1495
M62096_PEA_1_T13    2301                         2418

Segment cluster M62096_PEA_1_node_44 according to the present invention is
supported by 19 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 674 below describes the starting and ending position of this segment on each transcript.

Table 674 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2586                         2662
M62096_PEA_1_T5     2462                         2538
M62096_PEA_1_T6     1435                         1511
M62096_PEA_1_T7     1457                         1533
M62096_PEA_1_T9     1496                         1572
M62096_PEA_1_T13    2419                         2495



Segment cluster M62096_PEA_1_node_47 according to the present invention is
supported by 21 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T13.
Table 675 below describes the starting and ending position of this segment on each transcript.

Table 675 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2663                         2772
M62096_PEA_1_T5     2539                         2648
M62096_PEA_1_T6     1512                         1621
M62096_PEA_1_T7     1534                         1643
M62096_PEA_1_T9     1573                         1682
M62096_PEA_1_T13    2496                         2605



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 676.

Table 676 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
M62096_0_7_0            lung malignant tumors       LUN



Segment cluster M62096_PEA_1_node_51 according to the present invention is
supported by 11 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T11 and
M62096_PEA_1_T15. Table 677 below describes the starting and ending position of this
segment on each transcript.

Table 677 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2773                         2874
M62096_PEA_1_T5     2649                         2750
M62096_PEA_1_T6     1622                         1723
M62096_PEA_1_T7     1644                         1745
M62096_PEA_1_T9     1683                         1784
M62096_PEA_1_T11    689                          790
M62096_PEA_1_T15    689                          790



Segment cluster M62096_PEA_1_node_53 according to the present invention is
supported by 10 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T11 and
M62096_PEA_1_T15. Table 678 below describes the starting and ending position of this
segment on each transcript.

Table 678 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2875                         2935
M62096_PEA_1_T5     2751                         2811
M62096_PEA_1_T6     1724                         1784
M62096_PEA_1_T7     1746                         1806
M62096_PEA_1_T9     1785                         1845
M62096_PEA_1_T11    791                          851
M62096_PEA_1_T15    791                          851



Segment cluster M62096_PEA_1_node_55 according to the present invention is
supported by 9 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9, M62096_PEA_1_T11 and
M62096_PEA_1_T15. Table 679 below describes the starting and ending position of this
segment on each transcript.

Table 679 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     2936                         3007
M62096_PEA_1_T5     2812                         2883
M62096_PEA_1_T6     1785                         1856
M62096_PEA_1_T7     1807                         1878
M62096_PEA_1_T9     1846                         1917
M62096_PEA_1_T11    852                          923
M62096_PEA_1_T15    852                          923

Segment cluster M62096_PEA_1_node_58 according to the present invention is
supported by 9 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 680 below describes the starting and ending position of this segment on each transcript.

Table 680 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     3008                         3112
M62096_PEA_1_T5     2884                         2988
M62096_PEA_1_T6     1857                         1961
M62096_PEA_1_T7     1879                         1983
M62096_PEA_1_T9     1918                         2022
M62096_PEA_1_T11    924                          1028



Segment cluster M62096_PEA_1_node_62 according to the present invention is
supported by 14 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 681 below describes the starting and ending position of this segment on each transcript.

Table 681 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     3330                         3443
M62096_PEA_1_T5     3206                         3319
M62096_PEA_1_T6     2179                         2292
M62096_PEA_1_T7     2201                         2314
M62096_PEA_1_T9     2240                         2353
M62096_PEA_1_T11    1246                         1359









Segment cluster M62096_PEA_1_node_66 according to the present invention is
supported by 23 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 682 below describes the starting and ending position of this segment on each transcript.

Table 682 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     4764                         4881
M62096_PEA_1_T5     4640                         4757
M62096_PEA_1_T6     3613                         3730
M62096_PEA_1_T7     3635                         3752
M62096_PEA_1_T9     3674                         3791
M62096_PEA_1_T11    2680                         2797



Segment cluster M62096_PEA_1_node_67 according to the present invention can be
found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 683 below describes the starting and ending position of this segment on each transcript.

Table 683 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
M62096_PEA_1_T4     4882                         4887
M62096_PEA_1_T5     4758                         4763
M62096_PEA_1_T6     3731                         3736
M62096_PEA_1_T7     3753                         3758
M62096_PEA_1_T9     3792                         3797
M62096_PEA_1_T11    2798                         2803



Segment cluster M62096_PEA_1_node_68 according to the present invention can be
found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 684 below describes the starting and ending position of this segment on each transcript.

Table 684 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M62096_PEA_1_T4       4888                         4893
M62096_PEA_1_T5       4764                         4769
M62096_PEA_1_T6       3737                         3742
M62096_PEA_1_T7       3759                         3764
M62096_PEA_1_T9       3798                         3803
M62096_PEA_1_T11      2804                         2809



Segment cluster M62096_PEA_1_node_70 according to the present invention is
supported by 55 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M62096_PEA_1_T4, M62096_PEA_1_T5,
M62096_PEA_1_T6, M62096_PEA_1_T7, M62096_PEA_1_T9 and M62096_PEA_1_T11.
Table 685 below describes the starting and ending position of this segment on each transcript.

Table 685 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
M62096_PEA_1_T4       5827                         5881
M62096_PEA_1_T5       5703                         5757
M62096_PEA_1_T6       4676                         4730
M62096_PEA_1_T7       4698                         4752
M62096_PEA_1_T9       4737                         4791
M62096_PEA_1_T11      3743                         3797
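
As a non-limiting illustration only, the positions listed in Table 685 can be checked for internal consistency: a given segment should have the same length on every transcript in which it appears. The sketch below assumes that the listed positions are 1-based and inclusive (an assumption; this section does not state the convention), so that the segment length is the ending position minus the starting position plus one. The names `segment_positions` and `segment_length` are hypothetical and are introduced for illustration.

```python
# Start and end positions copied from Table 685 (segment cluster M62096_PEA_1_node_70).
# Assumption: positions are 1-based and inclusive.
segment_positions = {
    "M62096_PEA_1_T4": (5827, 5881),
    "M62096_PEA_1_T5": (5703, 5757),
    "M62096_PEA_1_T6": (4676, 4730),
    "M62096_PEA_1_T7": (4698, 4752),
    "M62096_PEA_1_T9": (4737, 4791),
    "M62096_PEA_1_T11": (3743, 3797),
}


def segment_length(start, end):
    """Return the length of a segment given inclusive start and end positions."""
    return end - start + 1


# Compute the segment length on each transcript.
lengths = {
    name: segment_length(start, end)
    for name, (start, end) in segment_positions.items()
}

# The segment is internally consistent if every transcript reports the same length.
consistent = len(set(lengths.values())) == 1

for name, length in lengths.items():
    print(name, length)
print("consistent:", consistent)
```

Under the stated assumption, the same check may be applied to the positions of any of the other segment location tables.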
Variant protein alignment to the previously known protein:

Sequence name: KF5C_HUMAN

Sequence documentation:

Alignment of: M62096_PEA_1_P4 x KF5C_HUMAN

Alignment segment 1/1:

Quality: 6936.00
Escore: 0
Matching length: 719    Total length: 719
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  7 VSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRIL 56
    ||||||||||||||||||||||||||||||||||||||||||||||||||
239 VSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRIL 288

 57 QDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELT 106
    ||||||||||||||||||||||||||||||||||||||||||||||||||
289 QDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELT 338

107 AEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQK 156
    ||||||||||||||||||||||||||||||||||||||||||||||||||
339 AEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQK 388

157 NLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQ 206
    ||||||||||||||||||||||||||||||||||||||||||||||||||
389 NLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQ 438

207 SQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVL 256
    ||||||||||||||||||||||||||||||||||||||||||||||||||
439 SQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVL 488

257 QALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQE 306
    ||||||||||||||||||||||||||||||||||||||||||||||||||
489 QALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQE 538

307 LSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMAR 356
    ||||||||||||||||||||||||||||||||||||||||||||||||||
539 LSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMAR 588

357 LYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEA 406
    ||||||||||||||||||||||||||||||||||||||||||||||||||
589 LYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEA 638

407 KIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKE 456
    ||||||||||||||||||||||||||||||||||||||||||||||||||
639 KIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKE 688

457 HLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDL 506
    ||||||||||||||||||||||||||||||||||||||||||||||||||
689 HLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDL 738

507 NQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLE 556
    ||||||||||||||||||||||||||||||||||||||||||||||||||
739 NQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLE 788

557 ETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLE 606
    ||||||||||||||||||||||||||||||||||||||||||||||||||
789 ETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLE 838

607 NNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKE 656
    ||||||||||||||||||||||||||||||||||||||||||||||||||
839 NNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKE 888

657 NAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTA 706
    ||||||||||||||||||||||||||||||||||||||||||||||||||
889 NAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTA 938

707 VHAIRGGGGSSSNSTHYQK 725
    |||||||||||||||||||
939 VHAIRGGGGSSSNSTHYQK 957
Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P5 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 6520.00
Escore: 0
Matching length: 674    Total length: 674
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MTRILQDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
284 MTRILQDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSV 333

 51 NLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQIS 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
334 NLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQIS 383

101 AKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
384 AKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDD 433

151 EINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDE 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
434 EINQQSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDE 483

201 VKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQREL 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
484 VKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQREL 533

251 SQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEE 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
534 SQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEE 583

301 FTMARLYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLI 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
584 FTMARLYISKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLI 633

351 SQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
634 SQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQ 683

401 DKEKEHLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIID 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
684 DKEKEHLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIID 733

451 EIRDLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQARED 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
734 EIRDLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQARED 783

501 LKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQK 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
784 LKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQK 833

551 ISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESAL 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
834 ISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESAL 883

601 KEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPA 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
884 KEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPA 933

651 SSPTAVHAIRGGGGSSSNSTHYQK 674
    ||||||||||||||||||||||||
934 SSPTAVHAIRGGGGSSSNSTHYQK 957



Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P3 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 5726.00
Escore: 0
Matching length: 593    Total length: 593
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
365 MELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEK 414

 51 EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYE 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
415 EKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDYE 464

101 KIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQ 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
465 KIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQ 514

151 LTDELAQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGI 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
515 LTDELAQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGI 564

201 IGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
565 IGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMD 614

251 SNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSL 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
615 SNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSL 664

301 SEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAH 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
665 SEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAH 714

351 QKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQERE 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
715 QKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQERE 764

401 MKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVK 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
765 MKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVK 814

451 KSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPK 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
815 KSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPK 864

501 LEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMA 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
865 LEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMA 914

551 RRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 593
    |||||||||||||||||||||||||||||||||||||||||||
915 RRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 957
Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P7 x KF5C_HUMAN

Alignment segment 1/1:

Quality: 2117.00
Escore: 0
Matching length: 220    Total length: 220
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

 20 LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGL 69
    ||||||||||||||||||||||||||||||||||||||||||||||||||
738 LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGL 787

 70 EETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFL 119
    ||||||||||||||||||||||||||||||||||||||||||||||||||
788 EETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFL 837

120 ENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAK 169
    ||||||||||||||||||||||||||||||||||||||||||||||||||
838 ENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAK 887

170 ENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPT 219
    ||||||||||||||||||||||||||||||||||||||||||||||||||
888 ENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPT 937

220 AVHAIRGGGGSSSNSTHYQK 239
    ||||||||||||||||||||
938 AVHAIRGGGGSSSNSTHYQK 957



Sequence name: KF5C_HUMAN

Sequence documentation:

Alignment of: M62096_PEA_1_P8 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 7146.00
Escore: 0
Matching length: 737    Total length: 737
Matching Percent Similarity: 100.00    Matching Percent Identity: 99.86
Total Percent Similarity: 100.00    Total Percent Identity: 99.86
Gaps: 0



Alignment:

  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50

 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100

101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150

151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200

201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250

251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300

301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350

351 EKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIID 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 EKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIID 400

401 NIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQML 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 NIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQML 450

451 DQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQ 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 DQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQ 500

501 KSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQELSNHQKKRATEI 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 KSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQELSNHQKKRATEI 550

551 LNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKS 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 LNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKS 600

601 LVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNM 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 LVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNM 650

651 EQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMK 700
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 EQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMK 700

701 KALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRE 737
    ||||||||||||||||||||||||||||||||||||:
701 KALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRD 737
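
As a non-limiting illustration only, the percent identity reported for an alignment can be understood as the fraction of aligned positions that carry the same residue in both sequences. The sketch below assumes a gap-free alignment of two equal-length strings and a simple definition of identity (identical residues divided by aligned length, times 100); the alignment program used to produce the values above may define its statistics differently. The function name `percent_identity` is hypothetical. The two strings are the final alignment row (residues 701-737) of M62096_PEA_1_P8 and of KF5C_HUMAN, which differ only at the last position.

```python
def percent_identity(query, subject):
    """Percent of aligned positions with identical residues (gap-free alignment assumed)."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    if not query:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


# Final alignment row (residues 701-737) for the variant and the known protein.
variant_row = "KALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRE"
known_row = "KALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRD"

# Identity over this single row: 36 of 37 positions match.
row_identity = percent_identity(variant_row, known_row)

# Over the full alignment, 736 of the 737 aligned positions match.
full_identity = round(100.0 * 736 / 737, 2)

print(round(row_identity, 2))
print(full_identity)
```

Under these assumptions, a single mismatch over 737 aligned residues is consistent with the identity value of 99.86 stated for this alignment.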



Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P9 x KF5C_HUMAN

Alignment segment 1/1:

Quality: 4434.00
Escore: 0
Matching length: 454    Total length: 454
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50

 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100

101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150

151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200

201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250

251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300

301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350

351 EKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIID 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 EKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIID 400

401 NIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQML 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 NIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQML 450

451 DQDE 454
    ||||
451 DQDE 454



Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P10 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 747.00
Escore: 0
Matching length: 78    Total length: 78
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

 20 LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGL 69
    ||||||||||||||||||||||||||||||||||||||||||||||||||
738 LNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGL 787

 70 EETVSRELQTLHNLRKLFVQDLTTRVKK 97
    ||||||||||||||||||||||||||||
788 EETVSRELQTLHNLRKLFVQDLTTRVKK 815



Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P11 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 3634.00
Escore: 0
Matching length: 372    Total length: 372
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0



Alignment:

  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50

 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100

101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150

151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200

201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250

251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300

301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEK 350

351 EKNKTLKNVIQHLEMELNRWRN 372
    ||||||||||||||||||||||
351 EKNKTLKNVIQHLEMELNRWRN 372



Sequence name: KF5C_HUMAN
Sequence documentation:

Alignment of: M62096_PEA_1_P12 x KF5C_HUMAN
Alignment segment 1/1:

Quality: 3145.00
Escore: 0
Matching length: 323    Total length: 323
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00    Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFD 50

 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 RVLPPNTTQEQVYNACAKQIVKDVLEGYNGTIFAYGQTSSGKTHTMEGKL 100

101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 HDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVS 150

151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 KTNLAVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNE 200

201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 HSSRSHSIFLINIKQENVETEKKLSGKLYLVDLAGSEKVSKTGAEGAVLD 250

251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 EAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTI 300

301 VICCSPSVFNEAETKSTLMFGQR 323
    |||||||||||||||||||||||
301 VICCSPSVFNEAETKSTLMFGQR 323
Expression of Homo sapiens protein tyrosine phosphatase, receptor type, S (PTPRS) M62069
transcripts which are detectable by amplicon as depicted in sequence name M62069 seg19 in
normal and cancerous lung tissues
Expression of Homo sapiens protein tyrosine phosphatase, receptor type, S (PTPRS)
transcripts detectable by or according to seg19, M62069 seg19 amplicon (SEQ ID NO: 1657)
and M62069 seg19F (SEQ ID NO: 1655) and M62069 seg19R (SEQ ID NO: 1656) primers was
measured by real time PCR. In parallel, the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO: 334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO: 1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO: 328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO: 331) - was measured similarly. For each RT sample, the expression of the above
amplicon was normalized to the geometric mean of the quantities of the housekeeping genes.
The normalized quantity of each RT sample was then divided by the median of the quantities of
the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above), to
obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.
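
As a non-limiting illustration only, the normalization described above can be sketched as follows. The sketch assumes that relative quantities have already been derived from the real time PCR measurements; all sample identifiers and numeric values below are hypothetical and are not experimental data, and the function names `normalize` and `fold_up_regulation` are introduced for illustration.

```python
from statistics import geometric_mean, median


def normalize(target_quantity, housekeeping_quantities):
    """Divide the target amplicon quantity by the geometric mean of the housekeeping genes."""
    return target_quantity / geometric_mean(housekeeping_quantities)


def fold_up_regulation(normalized, normal_sample_ids):
    """Divide each normalized quantity by the median of the normal samples."""
    reference = median(normalized[sample_id] for sample_id in normal_sample_ids)
    return {sample_id: qty / reference for sample_id, qty in normalized.items()}


# Hypothetical quantities: (target amplicon, [PBGD, HPRT1, Ubiquitin, SDHA]).
samples = {
    "normal_1": (2.0, [1.0, 1.0, 1.0, 1.0]),
    "normal_2": (4.0, [1.0, 1.0, 1.0, 1.0]),
    "normal_3": (6.0, [1.0, 1.0, 1.0, 1.0]),
    "tumor_1": (40.0, [2.0, 2.0, 2.0, 2.0]),
}

# Step 1: normalize each sample to its own housekeeping genes.
normalized = {
    sample_id: normalize(target, housekeeping)
    for sample_id, (target, housekeeping) in samples.items()
}

# Step 2: express each sample relative to the median of the normal samples.
fold = fold_up_regulation(normalized, ["normal_1", "normal_2", "normal_3"])

for sample_id, value in fold.items():
    print(sample_id, round(value, 2))
```

The geometric mean is used for the housekeeping genes, as stated above, so that no single reference gene dominates the normalization factor.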

Figure 65 is a histogram showing overexpression of the above-indicated Homo sapiens
protein tyrosine phosphatase, receptor type, S (PTPRS) transcripts in cancerous lung samples
relative to the normal samples. Values represent the average of duplicate experiments. Error
bars indicate the minimal and maximal values obtained.

As is evident from Figure 65, the expression of Homo sapiens protein tyrosine
phosphatase, receptor type, S (PTPRS) transcripts detectable by the above amplicon(s) in cancer
samples was significantly higher than in the non-cancerous samples (Sample Nos. 47-50, 90-93,
96-99, Table 2). Notably, an overexpression of at least 5 fold was found in 2 out of 15
adenocarcinoma samples, and in 8 out of 8 small cell carcinoma samples.
Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: M62069 seg19F forward primer; and
M62069 seg19R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: M62069 seg19.

Forward primer - M62069 seg19F (SEQ ID NO: 1655):
GCTGATTGTCCCCATGAAGG

Reverse primer - M62069 seg19R (SEQ ID NO: 1656):
TGGCATACGGGAACTCAGTG

Amplicon (SEQ ID NO: 1657):
GCTGATTGTCCCCATGAAGGCCAGCCTTGAAGCTTGGTCAGTCTCCCTAACTGTATG
ATTGATCCCCACTTATTGCACTACATCACTGAGTTCCCGTATGC



Expression of Homo sapiens protein tyrosine phosphatase, receptor type, S (PTPRS) M62069
transcripts which are detectable by amplicon as depicted in sequence name M62069 seg29 in
normal and cancerous lung tissues
Expression of Homo sapiens protein tyrosine phosphatase, receptor type, S (PTPRS)
transcripts detectable by or according to seg29, M62069 seg29 amplicon (SEQ ID NO: 1660)
and M62069 seg29F (SEQ ID NO: 1658) and M62069 seg29R (SEQ ID NO: 1659) primers was
measured by real time PCR. In parallel, the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO: 334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO: 1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO: 328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO: 331) - was measured similarly. For each RT sample, the expression of the above
amplicon was normalized to the geometric mean of the quantities of the housekeeping genes.
The normalized quantity of each RT sample was then divided by the median of the quantities of
the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above), to
obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.

Figure 66 is a histogram showing overexpression of the above-indicated Homo sapiens
protein tyrosine phosphatase, receptor type, S (PTPRS) transcripts in cancerous lung samples
relative to the normal samples. Values represent the average of duplicate experiments. Error
bars indicate the minimal and maximal values obtained.

As is evident from Figure 66, the expression of Homo sapiens protein tyrosine
phosphatase, receptor type, S (PTPRS) transcripts detectable by the above amplicon(s) in cancer
samples was significantly higher than in the non-cancerous samples (Sample Nos. 47-50, 90-93,
96-99, Table 2). Notably, an overexpression of at least 5 fold was found in 2 out of 15
adenocarcinoma samples, and in 7 out of 8 small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: M62069 seg29F forward primer; and
M62069 seg29R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: M62069 seg29.

Forward primer - M62069 seg29F (SEQ ID NO: 1658):
ATTGAATAATTCAGCACCTGAGGC

Reverse primer - M62069 seg29R (SEQ ID NO: 1659):
TTCATATGGCTACTCCCCACCT

Amplicon (SEQ ID NO: 1660):
ATTGAATAATTCAGCACCTGAGGCTGGTGGATGATTCTTTGCAATTTGGCAGGAATG
GGAGAGTCGGGAGCAGTAGTTGGCAAGGTGGGGAGTAGCCATATGAA
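
As a non-limiting illustration only, the relationship between a primer pair and its amplicon can be checked computationally: the amplicon is expected to begin with the forward primer and to end with the reverse complement of the reverse primer. The sketch below assumes that the amplicon is written 5' to 3' on the forward strand (an assumption); the function names `reverse_complement` and `primers_flank_amplicon` are hypothetical. The sequences are those given above for M62069 seg29.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def primers_flank_amplicon(forward, reverse, amplicon):
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


# Sequences as listed above for M62069 seg29.
forward_primer = "ATTGAATAATTCAGCACCTGAGGC"
reverse_primer = "TTCATATGGCTACTCCCCACCT"
amplicon = (
    "ATTGAATAATTCAGCACCTGAGGCTGGTGGATGATTCTTTGCAATTTGGCAGGAATG"
    "GGAGAGTCGGGAGCAGTAGTTGGCAAGGTGGGGAGTAGCCATATGAA"
)

print(reverse_complement(reverse_primer))
print(primers_flank_amplicon(forward_primer, reverse_primer, amplicon))
```

A check of this kind may also help to detect transcription errors in a listed primer or amplicon sequence.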
DESCRIPTION FOR CLUSTER M78076
Cluster M78076 features 9 transcript(s) and 35 segment(s) of interest, the names for
which are given in Tables 686 and 687, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 688.

Table 686 - Transcripts of interest

Transcript Name       Sequence ID No.
M78076_PEA_1_T2       74
M78076_PEA_1_T3       75
M78076_PEA_1_T5       76
M78076_PEA_1_T13      77
M78076_PEA_1_T15      78
M78076_PEA_1_T23      79
M78076_PEA_1_T26      80
M78076_PEA_1_T27      81
M78076_PEA_1_T28      82


Table 687 - Segments of interest

Segment Name                                       Sequence ID No.
M78076_PEA_1_node_0                                659
M78076_PEA_1_node_10                               660
M78076_PEA_1_node_15                               661
M78076_PEA_1_node_18                               662
M78076_PEA_1_node_20                               663
M78076_PEA_1_node_24                               664
M78076_PEA_1_node_26                               665
M78076_PEA_1_node_29                               666
M78076_PEA_1_node_32                               667
M78076_PEA_1_node_35                               668
M78076_PEA_1_node_37                               669
M78076_PEA_1_node_46                               670
M78076_PEA_1_node_47                               671
M78076_PEA_1_node_54                               672
M78076_PEA_1_node_1                                673
M78076_PEA_1_node_2                                674
M78076_PEA_1_node_3                                675
M78076_PEA_1_node_6                                676
M78076_PEA_1_node_7                                677
M78076_PEA_1_node_12                               678
M78076_PEA_1_node_22                               679
M78076_PEA_1_node_27                               680
M78076_PEA_1_node_30                               681
M78076_PEA_1_node_31                               682
M78076_PEA_1_node_34                               683
M78076_PEA_1_node_36                               684
M78076_PEA_1_node_41                               685
M78076_PEA_1_node_42                               686
M78076_PEA_1_node_4? [final digit illegible]       687
M78076_PEA_1_node_45                               688
M78076_PEA_1_node_49                               689
M78076_PEA_1_node_50                               690
M78076_PEA_1_node_51                               691
M78076_PEA_1_node_52                               692
M78076_PEA_1_node_53                               693



Table 688 - Proteins of interest

Protein Name        Sequence ID No.   Corresponding Transcript(s)
M78076_PEA_1_P3     1350              M78076_PEA_1_T2; M78076_PEA_1_T5



WO 2006/131783 



PCT/IB2005/004037 



764 



M78076_PEA_1_P4     1351              M78076_PEA_1_T3
M78076_PEA_1_P12    1352              M78076_PEA_1_T13
M78076_PEA_1_P14    1353              M78076_PEA_1_T15
M78076_PEA_1_P21    1354              M78076_PEA_1_T23
M78076_PEA_1_P24    1355              M78076_PEA_1_T26
M78076_PEA_1_P2     1356              M78076_PEA_1_T27
M78076_PEA_1_P25    1357              M78076_PEA_1_T28



These sequences are variants of the known protein Amyloid-like protein 1 precursor
(SwissProt accession identifier APP1_HUMAN; known also according to the synonyms APLP;
APLP-1), SEQ ID NO: 1439, referred to herein as the previously known protein.

Protein Amyloid-like protein 1 precursor is known or believed to have the following
function(s): May play a role in postsynaptic function. The C-terminal gamma-secretase
processed fragment, ALID1, activates transcription activation through APBB1 (Fe65) binding
(By similarity). Couples to JIP signal transduction through C-terminal binding. May interact
with cellular G-protein signaling pathways. Can regulate neurite outgrowth through binding to
components of the extracellular matrix such as heparin and collagen I. The gamma-CTF peptide,
C30, is a potent enhancer of neuronal apoptosis (By similarity). The sequence for protein
Amyloid-like protein 1 precursor is given at the end of the application, as "Amyloid-like protein
1 precursor amino acid sequence". Known polymorphisms for this sequence are as shown in
Table 689.

Table 689 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence   Comment
48                                       A -> P



Protein Amyloid-like protein 1 precursor localization is believed to be Type I membrane
protein. C-terminally processed in the Golgi complex.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: endocytosis; apoptosis; cell adhesion; neurogenesis; cell death, which
are annotation(s) related to Biological Process; protein binding; heparin binding, which are
annotation(s) related to Molecular Function; and basement membrane; coated pit; integral
membrane protein, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster M78076 features 9 transcript(s), which were listed in Table 686
above. These transcript(s) encode for protein(s) which are variant(s) of protein Amyloid-like
protein 1 precursor. A description of each variant protein according to the present invention is
now provided.

Variant protein M78076_PEA_1_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M78076_PEA_1_T2. An alignment is given to the known protein (Amyloid-like protein 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M78076_PEA_1_P3 and APP1_HUMAN:

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P3, comprising a first
amino acid sequence being at least 90% homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKD

corresponding to amino acids 1 - 517 of APP1_HUMAN, which also corresponds to amino
acids 1 - 517 of M78076_PEA_1_P3, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence GE corresponding to
amino acids 518 - 519 of M78076_PEA_1_P3, wherein said first amino acid sequence and
second amino acid sequence are contiguous and in a sequential order.
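
The segment layout stated in the comparison report (a first sequence covering amino acids 1 - 517 followed by a two-residue tail at 518 - 519) can be checked mechanically. The sketch below is a hedged illustration only: the `Segment` class and helper names are assumptions introduced here, and only the coordinates are taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    start: int   # 1-based, inclusive position on the variant protein
    end: int     # 1-based, inclusive position on the variant protein
    source: str  # free-text label for where the segment comes from


def is_contiguous(segments: List[Segment]) -> bool:
    """True if the segments start at position 1 and each one begins
    immediately after the previous one ends (sequential order, no gaps)."""
    expected_start = 1
    for seg in segments:
        if seg.start != expected_start or seg.end < seg.start:
            return False
        expected_start = seg.end + 1
    return True


def total_length(segments: List[Segment]) -> int:
    """Sum of the lengths of all segments."""
    return sum(seg.end - seg.start + 1 for seg in segments)


# Coordinates as stated for M78076_PEA_1_P3.
P3_SEGMENTS = [
    Segment(1, 517, "APP1_HUMAN amino acids 1-517"),
    Segment(518, 519, "unique tail GE"),
]

if __name__ == "__main__":
    print(is_contiguous(P3_SEGMENTS), total_length(P3_SEGMENTS))
```

The same check can be applied to the other variants by substituting the coordinates from their comparison reports.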

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
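
The reasoning used for localization (a signal peptide predicted by both programs and no trans-membrane region predicted by either implies "secreted") can be written as a small decision rule. This is a sketch under stated assumptions: the function name and the boolean inputs are hypothetical, the actual prediction programs are not modeled, and the "membrane" branch reflects the rule stated elsewhere in this application for variants in which both trans-membrane predictors agree. The sketch ignores where the trans-membrane region lies relative to the signal peptide.

```python
from typing import Sequence


def predict_localization(
    signal_peptide_calls: Sequence[bool],
    transmembrane_calls: Sequence[bool],
) -> str:
    """Combine per-program predictions into a localization label.

    signal_peptide_calls: one boolean per signal-peptide prediction program.
    transmembrane_calls:  one boolean per trans-membrane prediction program.
    """
    has_signal = all(signal_peptide_calls)
    if has_signal and not any(transmembrane_calls):
        return "secreted"
    if has_signal and all(transmembrane_calls):
        return "membrane"
    # Programs disagree or no signal peptide: this sketch makes no call.
    return "undetermined"


if __name__ == "__main__":
    # Two signal-peptide programs agree, no trans-membrane region predicted.
    print(predict_localization([True, True], [False, False]))
```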

Variant protein M78076_PEA_1_P3 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 690 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P3
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 690 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
4                                        A -> P                      Yes
6                                        P -> H                      Yes
13                                       R -> H                      Yes
34                                       Q ->                        No
38                                       G -> R                      Yes
88                                       P -> R                      Yes
124                                      R -> Q                      Yes
127                                      S ->                        No
145                                      F -> S                      No
214                                      G -> R                      No
214                                      G ->                        No
262                                      Q ->                        No
270                                      V ->                        No
309                                      G -> E                      Yes
370                                      Q ->                        No



The glycosylation sites of variant protein M78076_PEA_1_P3, as compared to the known
protein Amyloid-like protein 1 precursor, are described in Table 691 (given according to their
position(s) on the amino acid sequence in the first column; the second column indicates whether
the glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).

Table 691 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
337                                        yes                           337
461                                        yes                           461
551                                        no





Variant protein M78076_PEA_1_P3 is encoded by the following transcript(s):
M78076_PEA_1_T2, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M78076_PEA_1_T2 is shown in bold; this coding portion starts at
position 142 and ends at position 1698. The transcript also has the following SNPs as listed in
Table 692 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M78076_PEA_1_P3 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 692 - Nucleic acid SNPs 



SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
114                                   G ->                       No
151                                   G -> C                     Yes
158                                   C -> A                     Yes
179                                   G -> A                     Yes
219                                   A -> G                     Yes
243                                   G ->                       No
253                                   G -> A                     Yes
315                                   A -> G                     Yes
366                                   A -> G                     Yes
404                                   C -> G                     Yes
512                                   G -> A                     Yes
522                                   C ->                       No
522                                   C -> T                     No
575                                   T -> C                     No
781                                   G ->                       No
781                                   G -> A                     No
927                                   G ->                       No
951                                   C ->                       No
1067                                  G -> A                     Yes
1077                                  G -> A                     Yes
1251                                  G ->                       No
1398                                  G -> T                     Yes
1423                                  C -> T                     Yes
2146                                  G -> A                     Yes
2224                                  C -> T                     No
2362                                  C -> T                     Yes
2513                                  A -> G                     No
2656                                  C -> T                     Yes
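
The amino acid positions in Table 690 and the nucleotide positions in Table 692 appear to be related through the stated start of the coding portion (position 142). The following minimal sketch, in which the function name `codon_index` is an assumption introduced for illustration, maps a nucleotide position on the transcript to the 1-based amino acid position of the codon containing it; positions outside the stated coding portion map to no amino acid.

```python
from typing import Optional

# Coding portion of transcript M78076_PEA_1_T2 as stated in the text.
CDS_START = 142
CDS_END = 1698


def codon_index(
    nt_pos: int, cds_start: int = CDS_START, cds_end: int = CDS_END
) -> Optional[int]:
    """Return the 1-based amino acid position whose codon contains the
    given 1-based nucleotide position, or None outside the coding portion."""
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


if __name__ == "__main__":
    for nt in (114, 151, 158, 179, 253, 1067, 2146):
        print(nt, codon_index(nt))
```

Under this mapping, nucleotide positions 151, 158 and 179 fall in codons 4, 6 and 13, which agrees with the first three rows of Table 690. Nucleotide SNPs inside the coding portion that have no counterpart in Table 690 (for example position 219) are presumably silent, although the text does not state this explicitly.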
Variant protein M78076_PEA_1_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
M78076_PEA_1_T3. An alignment is given to the known protein (Amyloid-like protein 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M78076_PEA_1_P4 and APP1_HUMAN:

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P4, comprising a first
amino acid sequence being at least 90% homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKG

corresponding to amino acids 1 - 526 of APP1_HUMAN, which also corresponds to amino
acids 1 - 526 of M78076_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
ECLTVNPSLQIPLNP corresponding to amino acids 527 - 541 of M78076_PEA_1_P4, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ECLTVNPSLQIPLNP in M78076_PEA_1_P4.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein M78076_PEA_1_P4 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 693 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P4
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 693 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
4                                        A -> P                      Yes
6                                        P -> H                      Yes
13                                       R -> H                      Yes
34                                       Q ->                        No
38                                       G -> R                      Yes
88                                       P -> R                      Yes
124                                      R -> Q                      Yes
127                                      S ->                        No
145                                      F -> S                      No
214                                      G -> R                      No
214                                      G ->                        No
262                                      Q ->                        No
270                                      V ->                        No
309                                      G -> E                      Yes
370                                      Q ->                        No



The glycosylation sites of variant protein M78076_PEA_1_P4, as compared to the known
protein Amyloid-like protein 1 precursor, are described in Table 694 (given according to their
position(s) on the amino acid sequence in the first column; the second column indicates whether
the glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).

Table 694 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
337                                        yes                           337
461                                        yes                           461
551                                        no





Variant protein M78076_PEA_1_P4 is encoded by the following transcript(s):
M78076_PEA_1_T3, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M78076_PEA_1_T3 is shown in bold; this coding portion starts at
position 142 and ends at position 1764. The transcript also has the following SNPs as listed in
Table 695 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M78076_PEA_1_P4 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 695 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
114                                   G ->                       No
151                                   G -> C                     Yes
158                                   C -> A                     Yes
179                                   G -> A                     Yes
219                                   A -> G                     Yes
243                                   G ->                       No
253                                   G -> A                     Yes
315                                   A -> G                     Yes
366                                   A -> G                     Yes
404                                   C -> G                     Yes
512                                   G -> A                     Yes
522                                   C ->                       No
522                                   C -> T                     No
575                                   T -> C                     No
781                                   G ->                       No
781                                   G -> A                     No
927                                   G ->                       No
951                                   C ->                       No
1067                                  G -> A                     Yes
1077                                  G -> A                     Yes
1251                                  G ->                       No
1398                                  G -> T                     Yes
1423                                  C -> T                     Yes
1817                                  G -> A                     Yes
2362                                  G -> A                     Yes
2440                                  C -> T                     No
2578                                  C -> T                     Yes
2729                                  A -> G                     No
2872                                  C -> T                     Yes
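
The stated coding portion of transcript M78076_PEA_1_T3 (positions 142 to 1764) can be compared with the length of variant protein M78076_PEA_1_P4 given in the comparison report (526 shared residues plus a 15-residue tail). The helper below is an illustrative sketch only; its name is an assumption and is not taken from the application.

```python
def cds_codon_count(cds_start: int, cds_end: int) -> int:
    """Number of whole codons in a coding portion given by 1-based,
    inclusive nucleotide coordinates. Raises ValueError if the span
    is not a multiple of three nucleotides."""
    length = cds_end - cds_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # Coordinates for M78076_PEA_1_T3 as stated in the text.
    print(cds_codon_count(142, 1764))
```

The span corresponds to 541 codons, which equals the 541 amino acids of the variant. This suggests, as an inference rather than a statement from the text, that the stated end position does not include a stop codon.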
Variant protein M78076_PEA_1_P12 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M78076_PEA_1_T13. An alignment is given to the known protein (Amyloid-like protein 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M78076_PEA_1_P12 and APP1_HUMAN:

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P12, comprising a first
amino acid sequence being at least 90% homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKG

corresponding to amino acids 1 - 526 of APP1_HUMAN, which also corresponds to amino
acids 1 - 526 of M78076_PEA_1_P12, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
ECVCSKGFPFPLIGDSEG corresponding to amino acids 527 - 544 of M78076_PEA_1_P12,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P12, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ECVCSKGFPFPLIGDSEG in M78076_PEA_1_P12.
The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein M78076_PEA_1_P12 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 696 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P12
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 696 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
4                                        A -> P                      Yes
6                                        P -> H                      Yes
13                                       R -> H                      Yes
34                                       Q ->                        No
38                                       G -> R                      Yes
88                                       P -> R                      Yes
124                                      R -> Q                      Yes
127                                      S ->                        No
145                                      F -> S                      No
214                                      G -> R                      No
214                                      G ->                        No
262                                      Q ->                        No
270                                      V ->                        No
309                                      G -> E                      Yes
370                                      Q ->                        No
The glycosylation sites of variant protein M78076_PEA_1_P12, as compared to the
known protein Amyloid-like protein 1 precursor, are described in Table 697 (given according to
their position(s) on the amino acid sequence in the first column; the second column indicates
whether the glycosylation site is present in the variant protein; and the last column indicates
whether the position is different on the variant protein).

Table 697 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
337                                        yes                           337
461                                        yes                           461
551                                        no





Variant protein M78076_PEA_1_P12 is encoded by the following transcript(s):
M78076_PEA_1_T13, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M78076_PEA_1_T13 is shown in bold; this coding portion starts at
position 142 and ends at position 1773. The transcript also has the following SNPs as listed in
Table 698 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M78076_PEA_1_P12 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 698 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
114                                   G ->                       No
151                                   G -> C                     Yes
158                                   C -> A                     Yes
179                                   G -> A                     Yes
219                                   A -> G                     Yes
243                                   G ->                       No
253                                   G -> A                     Yes
315                                   A -> G                     Yes
366                                   A -> G                     Yes
404                                   C -> G                     Yes
512                                   G -> A                     Yes
522                                   C ->                       No
522                                   C -> T                     No
575                                   T -> C                     No
781                                   G ->                       No
781                                   G -> A                     No
927                                   G ->                       No
951                                   C ->                       No
1067                                  G -> A                     Yes
1077                                  G -> A                     Yes
1251                                  G ->                       No
1398                                  G -> T                     Yes
1423                                  C -> T                     Yes
1816                                  G -> A                     Yes
1894                                  C -> T                     No
2032                                  C -> T                     Yes
2183                                  A -> G                     No
2326                                  C -> T                     Yes



Variant protein M78076_PEA_1_P14 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M78076_PEA_1_T15. An alignment is given to the known protein (Amyloid-like protein 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M78076_PEA_1_P14 and APP1_HUMAN:

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P14, comprising a first
amino acid sequence being at least 90% homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD
QNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGST
EQDAASPEKEKMNPLEQYERKVNASVPRGFPFHSSEIQRDEL

corresponding to amino acids 1 - 570 of APP1_HUMAN, which also corresponds to amino
acids 1 - 570 of M78076_PEA_1_P14, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VRGGTAGYLGEETRGQRPGCDSQSHTGPSKKPSAPSPLPAGTSWDRGVP corresponding
to amino acids 571 - 619 of M78076_PEA_1_P14, wherein said first amino acid sequence and
second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P14, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VRGGTAGYLGEETRGQRPGCDSQSHTGPSKKPSAPSPLPAGTSWDRGVP in
M78076_PEA_1_P14.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein M78076_PEA_1_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 699 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P14
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 699 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
4                                        A -> P                      Yes
6                                        P -> H                      Yes
13                                       R -> H                      Yes
34                                       Q ->                        No
38                                       G -> R                      Yes
88                                       P -> R                      Yes
124                                      R -> Q                      Yes
127                                      S ->                        No
145                                      F -> S                      No
214                                      G -> R                      No
214                                      G ->                        No
262                                      Q ->                        No
270                                      V ->                        No
309                                      G -> E                      Yes
370                                      Q ->                        No



The glycosylation sites of variant protein M78076_PEA_1_P14, as compared to the
known protein Amyloid-like protein 1 precursor, are described in Table 700 (given according to
their position(s) on the amino acid sequence in the first column; the second column indicates
whether the glycosylation site is present in the variant protein; and the last column indicates
whether the position is different on the variant protein).

Table 700 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
337                                        yes                           337
461                                        yes                           461
551                                        yes                           551



Variant protein M78076_PEA_1_P14 is encoded by the following transcript(s):
M78076_PEA_1_T15, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript M78076_PEA_1_T15 is shown in bold; this coding portion starts at
position 142 and ends at position 1998. The transcript also has the following SNPs as listed in
Table 701 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein M78076_PEA_1_P14 sequence provides support for the
deduced sequence of this variant protein according to the present invention).



Table 701 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
114                                   G ->                       No
151                                   G -> C                     Yes
158                                   C -> A                     Yes
179                                   G -> A                     Yes
219                                   A -> G                     Yes
243                                   G ->                       No
253                                   G -> A                     Yes
315                                   A -> G                     Yes
366                                   A -> G                     Yes
404                                   C -> G                     Yes
512                                   G -> A                     Yes
522                                   C ->                       No
522                                   C -> T                     No
575                                   T -> C                     No
781                                   G ->                       No
781                                   G -> A                     No
927                                   G ->                       No
951                                   C ->                       No
1067                                  G -> A                     Yes
1077                                  G -> A                     Yes
1251                                  G ->                       No
1398                                  G -> T                     Yes
1423                                  C -> T                     Yes
2008                                  G -> A                     Yes
2086                                  C -> T                     No
2224                                  C -> T                     Yes
2375                                  A -> G                     No
2518                                  C -> T                     Yes



Variant protein M78076_PEA_1_P21 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
M78076_PEA_1_T23. An alignment is given to the known protein (Amyloid-like protein 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between M78076_PEA_1_P21 and APP1_HUMAN:

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P21, comprising a first
amino acid sequence being at least 90% homologous to

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN
E

corresponding to amino acids 1 - 352 of APP1_HUMAN, which also corresponds to amino
acids 1 - 352 of M78076_PEA_1_P21, and a second amino acid sequence being at least 90%
homologous to

AERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMT
LPKGSTEQDAASPEKEKMNPLEQYERKVNASVPRGFPFHSSEIQRDELAPAGTGVSREA
VSGLLIMGAGGGSLIVLSMLLLRRKKPYGAISHGVVEVDPMLTLEEQQLRELQRHGYE
NPTYRFLEERP

corresponding to amino acids 406 - 650 of APP1_HUMAN, which also
corresponds to amino acids 353 - 597 of M78076_PEA_1_P21, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
M78076_PEA_1_P21, comprising a polypeptide having a length "n", wherein n is at least about
10 amino acids in length, optionally at least about 20 amino acids in length, preferably at least
about 30 amino acids in length, more preferably at least about 40 amino acids in length and most
preferably at least about 50 amino acids in length, wherein at least two amino acids comprise
EA, having a structure as follows: a sequence starting from any of amino acid numbers 352-x to
352; and ending at any of amino acid numbers 353 + ((n-2) - x), in which x varies from 0 to n-2.
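
The edge-portion definition above can be read as a family of windows of length n that all span the junction residues 352 and 353 (the "EA" pair). The sketch below enumerates those windows under one plausible reading of the claim language, namely that for each x the window starts at 352 - x and ends at 353 + ((n - 2) - x). The function name is an assumption introduced for illustration.

```python
from typing import List, Tuple


def edge_windows(n: int, left: int = 352, right: int = 353) -> List[Tuple[int, int]]:
    """Enumerate (start, end) coordinates, 1-based and inclusive, of every
    window of length n that contains both junction residues.

    x runs from 0 to n-2; start = left - x and end = right + (n - 2) - x.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span both junction residues")
    return [(left - x, right + (n - 2) - x) for x in range(n - 1)]


if __name__ == "__main__":
    for start, end in edge_windows(10):
        print(start, end)
```

For n = 10 this yields nine windows, from 352 - 361 (x = 0) to 344 - 353 (x = 8), each exactly ten residues long and each containing positions 352 and 353.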

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
membrane. The protein localization is believed to be membrane because although both
signal-peptide prediction programs agree that this protein has a signal peptide, both trans-membrane
region prediction programs predict that this protein has a trans-membrane region downstream of
this signal peptide.

Variant protein M78076_PEA_1_P21 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 702 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P21
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 702 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
4                                        A -> P                      Yes
6                                        P -> H                      Yes
13                                       R -> H                      Yes
34                                       Q ->                        No
38                                       G -> R                      Yes
88                                       P -> R                      Yes
124                                      R -> Q                      Yes
127                                      S ->                        No
145                                      F -> S                      No
214                                      G -> R                      No
214                                      G ->                        No
262                                      Q ->                        No
270                                      V ->                        No
309                                      G -> E                      Yes

The glycosylation sites of variant protein M78076_PEA_1_P21, as compared to the 
known protein Amyloid-like protein 1 precursor, are described in Table 703 (given according to 
their position(s) on the amino acid sequence in the first column; the second column indicates 
whether the glycosylation site is present in the variant protein; and the last column indicates 
whether the position is different on the variant protein). 

Table 703 - Glycosylation site(s) 

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein? 
337 | yes | 337 
461 | yes | 408 
551 | yes | 498 



Variant protein M78076_PEA_1_P21 is encoded by the following transcript(s): 
M78076_PEA_1_T23, for which the sequence(s) is/are given at the end of the application. The 
coding portion of transcript M78076_PEA_1_T23 is shown in bold; this coding portion starts at 
position 142 and ends at position 1932. The transcript also has the following SNPs as listed in 
Table 704 (given according to their position on the nucleotide sequence, with the alternative 
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of 
known SNPs in variant protein M78076_PEA_1_P21 sequence provides support for the 
deduced sequence of this variant protein according to the present invention). 

Table 704 - Nucleic acid SNPs 

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP? 
114 | G -> | No 
151 | G -> C | Yes 
158 | C -> A | Yes 
179 | G -> A | Yes 
219 | A -> G | Yes 
243 | G -> | No 
253 | G -> A | Yes 
315 | A -> G | Yes 
366 | A -> G | Yes 
404 | C -> G | Yes 
512 | G -> A | Yes 
522 | C -> | No 
522 | C -> T | No 
575 | T -> C | No 
781 | G -> | No 
781 | G -> A | No 
927 | G -> | No 
951 | C -> | No 
1067 | G -> A | Yes 
1077 | G -> A | Yes 
1239 | G -> T | Yes 
1264 | C -> T | Yes 
1728 | G -> A | Yes 
1806 | C -> T | No 
1944 | C -> T | Yes 
2095 | A -> G | No 
2238 | C -> T | Yes 
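
The relationship between the nucleotide positions of Table 704 and the amino acid positions of Table 702 can be sketched as follows. This is an illustration only: the function name is hypothetical, and the sketch assumes 1-based positions and that the first nucleotide of the coding portion (position 142) is the first base of codon 1.

```python
def amino_acid_position(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to a 1-based codon (amino acid) number.

    Returns None when the position lies outside the coding portion.
    """
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Coding portion of transcript M78076_PEA_1_T23 as stated in the text.
CDS_START, CDS_END = 142, 1932

# A few SNP positions taken from Table 704.
for nt_pos in (114, 151, 158, 179, 1067, 1944):
    print(nt_pos, amino_acid_position(nt_pos, CDS_START, CDS_END))
```

Under these assumptions nucleotide positions 151, 158, 179 and 1067 fall in codons 4, 6, 13 and 309, which agrees with the corresponding rows of Table 702, while positions 114 and 1944 lie outside the coding portion. Positions that map to a codon but have no entry in Table 702 are presumably silent, since that table lists only non-silent SNPs.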



Variant protein M78076_PEA_1_P24 according to the present invention has an amino 
acid sequence as given at the end of the application; it is encoded by transcript(s) 
M78076_PEA_1_T26. An alignment is given to the known protein (Amyloid-like protein 1 
precursor) at the end of the application. One or more alignments to one or more previously 
published protein sequences are given at the end of the application. A brief description of the 
relationship of the variant protein according to the present invention to each such aligned protein 
is as follows: 

Comparison report between M78076_PEA_1_P24 and APP1_HUMAN: 

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P24, comprising a first 
amino acid sequence being at least 90% homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ 
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG 
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV 
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN 
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL 
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQSLGLLD 
QNPHLAQELRPQI corresponding to amino acids 1 - 481 of APP1_HUMAN, which also 
corresponds to amino acids 1 - 481 of M78076_PEA_1_P24, and a second amino acid sequence 
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 
90% and most preferably at least 95% homologous to a polypeptide having the sequence 
RECLLPWLPLQISEGRS corresponding to amino acids 482 - 498 of M78076_PEA_1_P24, 
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P24, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence RECLLPWLPLQISEGRS in M78076_PEA_1_P24. 

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because both signal-peptide 
prediction programs predict that this protein has a signal peptide, and neither trans-membrane 
region prediction program predicts that this protein has a trans-membrane region. 
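
The reasoning used for the localization calls in this section can be summarized by a minimal sketch. The function below is hypothetical and is not the software actually used; it assumes each prediction program returns a simple yes/no call, and it ignores the position of any trans-membrane region relative to the signal peptide.

```python
def predict_localization(signal_peptide_calls, tm_region_calls):
    """Combine yes/no calls from signal-peptide and trans-membrane predictors.

    Both arguments are lists of booleans, one entry per prediction program.
    """
    has_signal_peptide = all(signal_peptide_calls)
    all_predict_tm = all(tm_region_calls)
    none_predict_tm = not any(tm_region_calls)

    if has_signal_peptide and all_predict_tm:
        return "membrane"      # signal peptide plus a trans-membrane region
    if has_signal_peptide and none_predict_tm:
        return "secreted"      # signal peptide and no trans-membrane region
    return "undetermined"      # cases not covered by the text


# Two signal-peptide programs agree; neither trans-membrane program fires.
call = predict_localization([True, True], [False, False])
print(call)
```

Only the two outcomes described in the text (membrane and secreted) are modeled; any disagreement between programs is reported as undetermined rather than guessed.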

Variant protein M78076_PEA_1_P24 also has the following non-silent SNPs (Single 
Nucleotide Polymorphisms) as listed in Table 705, (given according to their position(s) on the 
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether 
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P24 
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 



WO 2006/131783 



PCT/IB2005/004037 






Table 705 - Amino acid mutations 

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP? 
4 | A -> P | Yes 
6 | P -> H | Yes 
13 | R -> H | Yes 
34 | Q -> | No 
38 | G -> R | Yes 
88 | P -> R | Yes 
124 | R -> Q | Yes 
127 | S -> | No 
145 | F -> S | No 
214 | G -> R | No 
214 | G -> | No 
262 | Q -> | No 
270 | V -> | No 
309 | G -> E | Yes 
370 | Q -> | No 



The glycosylation sites of variant protein M78076_PEA_1_P24, as compared to the 
known protein Amyloid-like protein 1 precursor, are described in Table 706 (given according to 
their position(s) on the amino acid sequence in the first column; the second column indicates 
whether the glycosylation site is present in the variant protein; and the last column indicates 
whether the position is different on the variant protein). 

Table 706 - Glycosylation site(s) 

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein? 
337 | yes | 337 
461 | yes | 461 
551 | no | 



Variant protein M78076_PEA_1_P24 is encoded by the following transcript(s): 
M78076_PEA_1_T26, for which the sequence(s) is/are given at the end of the application. The 
coding portion of transcript M78076_PEA_1_T26 is shown in bold; this coding portion starts at 
position 142 and ends at position 1635. The transcript also has the following SNPs as listed in 
Table 707 (given according to their position on the nucleotide sequence, with the alternative 
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of 
known SNPs in variant protein M78076_PEA_1_P24 sequence provides support for the 
deduced sequence of this variant protein according to the present invention). 

Table 707 - Nucleic acid SNPs 

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP? 
114 | G -> | No 
151 | G -> C | Yes 
158 | C -> A | Yes 
179 | G -> A | Yes 
219 | A -> G | Yes 
243 | G -> | No 
253 | G -> A | Yes 
315 | A -> G | Yes 
366 | A -> G | Yes 
404 | C -> G | Yes 
512 | G -> A | Yes 
522 | C -> | No 
522 | C -> T | No 
575 | T -> C | No 
781 | G -> | No 
781 | G -> A | No 
927 | G -> | No 
951 | C -> | No 
1067 | G -> A | Yes 
1077 | G -> A | Yes 
1251 | G -> | No 
1398 | G -> T | Yes 
1423 | C -> T | Yes 
2184 | G -> A | Yes 
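
As a consistency check on the figures given for transcript M78076_PEA_1_T26, the length of the coding portion can be compared with the length of the encoded protein, and each SNP of Table 707 can be placed inside or outside the coding portion. The sketch below is illustrative only; the function names are hypothetical and positions are assumed to be 1-based and inclusive.

```python
def coding_length_in_codons(cds_start, cds_end):
    """Number of complete codons in a coding portion given 1-based inclusive bounds."""
    length_nt = cds_end - cds_start + 1
    if length_nt % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return length_nt // 3


def snp_region(nt_pos, cds_start, cds_end):
    """Say whether a transcript position lies before, inside or after the coding portion."""
    if nt_pos < cds_start:
        return "upstream of coding portion"
    if nt_pos > cds_end:
        return "downstream of coding portion"
    return "coding portion"


# Coding portion of M78076_PEA_1_T26 as stated in the text.
codons = coding_length_in_codons(142, 1635)
print(codons)
for nt_pos in (114, 1251, 2184):
    print(nt_pos, snp_region(nt_pos, 142, 1635))
```

The coding portion spans 1494 nucleotides, that is 498 codons, which matches the 498 amino acids described for M78076_PEA_1_P24 in the comparison report (amino acids 1 - 481 followed by amino acids 482 - 498).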



Variant protein M78076_PEA_1_P2 according to the present invention has an amino acid 
sequence as given at the end of the application; it is encoded by transcript(s) 
M78076_PEA_1_T27. An alignment is given to the known protein (Amyloid-like protein 1 
precursor) at the end of the application. One or more alignments to one or more previously 
published protein sequences are given at the end of the application. A brief description of the 
relationship of the variant protein according to the present invention to each such aligned protein 
is as follows: 

Comparison report between M78076_PEA_1_P2 and APP1_HUMAN: 

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P2, comprising a first 
amino acid sequence being at least 90% homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ 
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG 
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV 
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN 
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL 
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQV corresponding to amino acids 
1 - 449 of APP1_HUMAN, which also corresponds to amino acids 1 - 449 of 
M78076_PEA_1_P2, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence 

LTSFQLPNAPLFLRRPRLRLFSCPLDPLSVSWTPSYPLNTASLPLPSLSAQLPDPETWTLT 
CCVFDPCFLALGFLLPPPSILCSVPWIFTAFPRIVFFFFFFLRQVLALSPRQESSVRSWLIAT 
STSWVQAILLPQPLE corresponding to amino acids 450 - 588 of M78076_PEA_1_P2, 
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a 
sequential order. 

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P2, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence 

LTSFQLPNAPLFLRRPRLRLFSCPLDPLSVSWTPSYPLNTASLPLPSLSAQLPDPETWTLT 
CCVFDPCFLALGFLLPPPSILCSVPWIFTAFPRIVFFFFFFLRQVLALSPRQESSVRSWLIAT 
STSWVQAILLPQPLE in M78076_PEA_1_P2. 

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because although both 
signal-peptide prediction programs agree that this protein has a signal peptide, both trans-membrane 
region prediction programs predict that this protein has a trans-membrane region downstream of 
this signal peptide. 

Variant protein M78076_PEA_1_P2 also has the following non-silent SNPs (Single 
Nucleotide Polymorphisms) as listed in Table 708 (given according to their position(s) on the 
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether 
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P2 
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 

Table 708 - Amino acid mutations 

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP? 
4 | A -> P | Yes 
6 | P -> H | Yes 
13 | R -> H | Yes 
34 | Q -> | No 
38 | G -> R | Yes 
88 | P -> R | Yes 
124 | R -> Q | Yes 
127 | S -> | No 
145 | F -> S | No 
214 | G -> R | No 
214 | G -> | No 
262 | Q -> | No 
270 | V -> | No 
309 | G -> E | Yes 
370 | Q -> | No 
520 | A -> S | Yes 
546 | F -> | Yes 
564 | S -> C | Yes 



The glycosylation sites of variant protein M78076_PEA_1_P2, as compared to the known 
protein Amyloid-like protein 1 precursor, are described in Table 709 (given according to their 
position(s) on the amino acid sequence in the first column; the second column indicates whether 
the glycosylation site is present in the variant protein; and the last column indicates whether the 
position is different on the variant protein). 

Table 709 - Glycosylation site(s) 

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein? 
337 | yes | 337 
461 | no | 
551 | no | 

Variant protein M78076_PEA_1_P2 is encoded by the following transcript(s): 
M78076_PEA_1_T27, for which the sequence(s) is/are given at the end of the application. The 
coding portion of transcript M78076_PEA_1_T27 is shown in bold; this coding portion starts at 
position 142 and ends at position 1905. The transcript also has the following SNPs as listed in 
Table 710 (given according to their position on the nucleotide sequence, with the alternative 
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of 
known SNPs in variant protein M78076_PEA_1_P2 sequence provides support for the deduced 
sequence of this variant protein according to the present invention). 

Table 710 - Nucleic acid SNPs 

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP? 
114 | G -> | No 
151 | G -> C | Yes 
158 | C -> A | Yes 
179 | G -> A | Yes 
219 | A -> G | Yes 
243 | G -> | No 
253 | G -> A | Yes 
315 | A -> G | Yes 
366 | A -> G | Yes 
404 | C -> G | Yes 
512 | G -> A | Yes 
522 | C -> | No 
522 | C -> T | No 
575 | T -> C | No 
781 | G -> | No 
781 | G -> A | No 
927 | G -> | No 
951 | C -> | No 
1067 | G -> A | Yes 
1077 | G -> A | Yes 
1251 | G -> | No 
1398 | G -> T | Yes 
1423 | C -> T | Yes 
1500 | C -> T | Yes 
1699 | G -> T | Yes 
1725 | G -> A | Yes 
1777 | T -> | Yes 
1831 | A -> T | Yes 
2274 | A -> G | Yes 
2525 | A -> G | Yes 
2681 | G -> A | Yes 
3831 | G -> A | Yes 



Variant protein M78076_PEA_1_P25 according to the present invention has an amino 
acid sequence as given at the end of the application; it is encoded by transcript(s) 
M78076_PEA_1_T28. An alignment is given to the known protein (Amyloid-like protein 1 
precursor) at the end of the application. One or more alignments to one or more previously 
published protein sequences are given at the end of the application. A brief description of the 
relationship of the variant protein according to the present invention to each such aligned protein 
is as follows: 

Comparison report between M78076_PEA_1_P25 and APP1_HUMAN: 

1. An isolated chimeric polypeptide encoding for M78076_PEA_1_P25, comprising a first 
amino acid sequence being at least 90% homologous to 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGL 
CGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPME 
RWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPEGCRFLHQERMDQCESSTRRHQ 
EAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPG 
SRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGV 
DIYFGMPGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALN 
EHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQADPPQAERVLL 
ALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQ corresponding to amino acids 
1 - 448 of APP1_HUMAN, which also corresponds to amino acids 1 - 448 of 
M78076_PEA_1_P25, and a second amino acid sequence being at least 70%, optionally at least 
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95% 
homologous to a polypeptide having the sequence 

PQNPNSQPRAAGSLEVIISHPFVRRLEILISPFQFQNSIPKNSQIVPAASPRGTSSP 
corresponding to amino acids 449 - 505 of M78076_PEA_1_P25, wherein said first amino acid 
sequence and second amino acid sequence are contiguous and in a sequential order. 

2. An isolated polypeptide encoding for a tail of M78076_PEA_1_P25, comprising a 
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%, 
more preferably at least about 90% and most preferably at least about 95% homologous to the 
sequence 

PQNPNSQPRAAGSLEVIISHPFVRRLEILISPFQFQNSIPKNSQIVPAASPRGTSSP 
in M78076_PEA_1_P25. 

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because both signal-peptide 
prediction programs predict that this protein has a signal peptide, and neither trans-membrane 
region prediction program predicts that this protein has a trans-membrane region. 

Variant protein M78076_PEA_1_P25 also has the following non-silent SNPs (Single 
Nucleotide Polymorphisms) as listed in Table 711 (given according to their position(s) on the 
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether 
the SNP is known or not; the presence of known SNPs in variant protein M78076_PEA_1_P25 
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 

Table 711 - Amino acid mutations 

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP? 
4 | A -> P | Yes 
6 | P -> H | Yes 
13 | R -> H | Yes 
34 | Q -> | No 
38 | G -> R | Yes 
88 | P -> R | Yes 
124 | R -> Q | Yes 
127 | S -> | No 
145 | F -> S | No 
214 | G -> R | No 
214 | G -> | No 
262 | Q -> | No 
270 | V -> | No 
309 | G -> E | Yes 
370 | Q -> | No 



The glycosylation sites of variant protein M78076_PEA_1_P25, as compared to the 
known protein Amyloid-like protein 1 precursor, are described in Table 712 (given according to 
their position(s) on the amino acid sequence in the first column; the second column indicates 
whether the glycosylation site is present in the variant protein; and the last column indicates 
whether the position is different on the variant protein). 

Table 712 - Glycosylation site(s) 

Position(s) on known amino acid sequence | Present in variant protein? | Position in variant protein? 
337 | yes | 337 
461 | no | 
551 | no | 





Variant protein M78076_PEA_1_P25 is encoded by the following transcript(s): 
M78076_PEA_1_T28, for which the sequence(s) is/are given at the end of the application. The 
coding portion of transcript M78076_PEA_1_T28 is shown in bold; this coding portion starts at 
position 142 and ends at position 1656. The transcript also has the following SNPs as listed in 
Table 713 (given according to their position on the nucleotide sequence, with the alternative 
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of 
known SNPs in variant protein M78076_PEA_1_P25 sequence provides support for the 
deduced sequence of this variant protein according to the present invention). 

Table 713 - Nucleic acid SNPs 

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP? 
114 | G -> | No 
151 | G -> C | Yes 
158 | C -> A | Yes 
179 | G -> A | Yes 
219 | A -> G | Yes 
243 | G -> | No 
253 | G -> A | Yes 
315 | A -> G | Yes 
366 | A -> G | Yes 
404 | C -> G | Yes 
512 | G -> A | Yes 
522 | C -> | No 
522 | C -> T | No 
575 | T -> C | No 
781 | G -> | No 
781 | G -> A | No 
927 | G -> | No 
951 | C -> | No 
1067 | G -> A | Yes 
1077 | G -> A | Yes 
1251 | G -> | No 
1398 | G -> T | Yes 
1423 | C -> T | Yes 
1593 | A -> G | No 
1736 | C -> T | Yes 



As noted above, cluster M78076 features 35 segment(s), which were listed in Table 2 
above and for which the sequence(s) are given at the end of the application. These segment(s) 
are portions of nucleic acid sequence(s) which are described herein separately because they are 
of particular interest. A description of each segment according to the present invention is now 
provided. 

Segment cluster M78076_PEA_1_node_0 according to the present invention is supported 
by 47 libraries. The number of libraries was determined as previously described. This segment 
can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 714 below 
describes the starting and ending position of this segment on each transcript. 



Table 714 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | | 160 
M78076_PEA_1_T3 | | 160 
M78076_PEA_1_T5 | | 160 
M78076_PEA_1_T13 | | 160 
M78076_PEA_1_T15 | | 160 
M78076_PEA_1_T23 | | 160 
M78076_PEA_1_T26 | | 160 
M78076_PEA_1_T27 | | 160 
M78076_PEA_1_T28 | | 160 

Segment cluster M78076_PEA_1_node_10 according to the present invention is 
supported by 70 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 715 below 
describes the starting and ending position of this segment on each transcript. 



Table 715 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 433 | 565 
M78076_PEA_1_T3 | 433 | 565 
M78076_PEA_1_T5 | 433 | 565 
M78076_PEA_1_T13 | 433 | 565 
M78076_PEA_1_T15 | 433 | 565 
M78076_PEA_1_T23 | 433 | 565 
M78076_PEA_1_T26 | 433 | 565 
M78076_PEA_1_T27 | 433 | 565 
M78076_PEA_1_T28 | 433 | 565 
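
The segment location tables can be read programmatically. The following minimal sketch, which assumes 1-based inclusive coordinates and uses a hypothetical helper name, computes the length of segment M78076_PEA_1_node_10 on each transcript from the start and end positions of Table 715.

```python
def segment_length(start, end):
    """Length in nucleotides of a segment given 1-based inclusive positions."""
    return end - start + 1


# Start and end positions of segment node_10 on each transcript (Table 715).
node_10 = {
    "M78076_PEA_1_T2": (433, 565),
    "M78076_PEA_1_T3": (433, 565),
    "M78076_PEA_1_T5": (433, 565),
    "M78076_PEA_1_T13": (433, 565),
    "M78076_PEA_1_T15": (433, 565),
    "M78076_PEA_1_T23": (433, 565),
    "M78076_PEA_1_T26": (433, 565),
    "M78076_PEA_1_T27": (433, 565),
    "M78076_PEA_1_T28": (433, 565),
}

lengths = {name: segment_length(start, end) for name, (start, end) in node_10.items()}
for name, length in lengths.items():
    print(name, length)
```

Under this reading the segment is 133 nucleotides long on every transcript listed, as expected for a segment shared at the same location by all nine transcripts.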



Segment cluster M78076_PEA_1_node_15 according to the present invention is 
supported by 74 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 716 below 
describes the starting and ending position of this segment on each transcript. 



Table 716 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 679 | 812 
M78076_PEA_1_T3 | 679 | 812 
M78076_PEA_1_T5 | 679 | 812 
M78076_PEA_1_T13 | 679 | 812 
M78076_PEA_1_T15 | 679 | 812 
M78076_PEA_1_T23 | 679 | 812 
M78076_PEA_1_T26 | 679 | 812 
M78076_PEA_1_T27 | 679 | 812 
M78076_PEA_1_T28 | 679 | 812 



Segment cluster M78076_PEA_1_node_18 according to the present invention is 
supported by 95 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 717 below 
describes the starting and ending position of this segment on each transcript. 



Table 717 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 813 | 991 
M78076_PEA_1_T3 | 813 | 991 
M78076_PEA_1_T5 | 813 | 991 
M78076_PEA_1_T13 | 813 | 991 
M78076_PEA_1_T15 | 813 | 991 
M78076_PEA_1_T23 | 813 | 991 
M78076_PEA_1_T26 | 813 | 991 
M78076_PEA_1_T27 | 813 | 991 
M78076_PEA_1_T28 | 813 | 991 

Segment cluster M78076_PEA_1_node_20 according to the present invention is 
supported by 99 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 718 below 
describes the starting and ending position of this segment on each transcript. 



Table 718 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 992 | 1122 
M78076_PEA_1_T3 | 992 | 1122 
M78076_PEA_1_T5 | 992 | 1122 
M78076_PEA_1_T13 | 992 | 1122 
M78076_PEA_1_T15 | 992 | 1122 
M78076_PEA_1_T23 | 992 | 1122 
M78076_PEA_1_T26 | 992 | 1122 
M78076_PEA_1_T27 | 992 | 1122 
M78076_PEA_1_T28 | 992 | 1122 



Segment cluster M78076_PEA_1_node_24 according to the present invention is 
supported by 105 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): M78076_PEA_1_T2, 
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 719 below 
describes the starting and ending position of this segment on each transcript. 

Table 719 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 1198 | 1356 
M78076_PEA_1_T3 | 1198 | 1356 
M78076_PEA_1_T5 | 1198 | 1356 
M78076_PEA_1_T13 | 1198 | 1356 
M78076_PEA_1_T15 | 1198 | 1356 
M78076_PEA_1_T26 | 1198 | 1356 
M78076_PEA_1_T27 | 1198 | 1356 
M78076_PEA_1_T28 | 1198 | 1356 



Segment cluster M78076_PEA_1_node_26 according to the present invention is 
supported by 99 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, 
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, 
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 720 below 
describes the starting and ending position of this segment on each transcript. 



Table 720 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 1357 | 1485 
M78076_PEA_1_T3 | 1357 | 1485 
M78076_PEA_1_T5 | 1357 | 1485 
M78076_PEA_1_T13 | 1357 | 1485 
M78076_PEA_1_T15 | 1357 | 1485 
M78076_PEA_1_T23 | 1198 | 1326 
M78076_PEA_1_T26 | 1357 | 1485 
M78076_PEA_1_T27 | 1357 | 1485 
M78076_PEA_1_T28 | 1357 | 1485 
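
The different location of segment M78076_PEA_1_node_26 on transcript M78076_PEA_1_T23 can be related to the preceding segment. The sketch below is an illustration under the assumption of 1-based inclusive coordinates, with a hypothetical helper name; it compares the shift in Table 720 with the length of segment M78076_PEA_1_node_24 from Table 719, a segment that is not listed for M78076_PEA_1_T23.

```python
def segment_length(start, end):
    """Length in nucleotides of a segment given 1-based inclusive positions."""
    return end - start + 1


# Segment node_24 on transcript M78076_PEA_1_T2 (Table 719).
node_24_on_T2 = (1198, 1356)

# Segment node_26 on two transcripts (Table 720).
node_26 = {
    "M78076_PEA_1_T2": (1357, 1485),
    "M78076_PEA_1_T23": (1198, 1326),
}

# How far node_26 is shifted toward the 5' end on T23 relative to T2.
shift = node_26["M78076_PEA_1_T2"][0] - node_26["M78076_PEA_1_T23"][0]
node_24_length = segment_length(*node_24_on_T2)
print(shift, node_24_length)
```

The shift equals the length of segment node_24, which is consistent with transcript M78076_PEA_1_T23 lacking that segment while carrying segment node_26 at unchanged length.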



Segment cluster M78076_PEA_1_node_29 according to the present invention is 
supported by 2 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T27. Table 721 below 
describes the starting and ending position of this segment on each transcript. 

Table 721 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T27 | 1490 | 3132 



Segment cluster M78076_PEA_1_node_32 according to the present invention is 
supported by 2 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T26 and 
M78076_PEA_1_T27. Table 722 below describes the starting and ending position of this 
segment on each transcript. 

Table 722 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T26 | 1586 | 2457 
M78076_PEA_1_T27 | 3233 | 4104 



Segment cluster M78076_PEA_1_node_35 according to the present invention is 
supported by 4 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T2 and 
M78076_PEA_1_T5. Table 723 below describes the starting and ending position of this 
segment on each transcript. 

Table 723 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 1694 | 1952 
M78076_PEA_1_T5 | 1694 | 1952 



Segment cluster M78076_PEA_1_node_37 according to the present invention is 
supported by 11 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T3 and 
M78076_PEA_1_T5. Table 724 below describes the starting and ending position of this 
segment on each transcript. 

Table 724 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T3 | 1718 | 2180 
M78076_PEA_1_T5 | 1977 | 2439 



Segment cluster M78076_PEA_1_node_46 according to the present invention is 
supported by 3 libraries. The number of libraries was determined as previously described. This 
segment can be found in the following transcript(s): M78076_PEA_1_T15. Table 725 below 
describes the starting and ending position of this segment on each transcript. 

Table 725 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T15 | 1852 | 1972 

Segment cluster M78076_PEA_1_node_47 according to the present invention is 
supported by 155 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): M78076_PEA_1_T2, 
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and 
M78076_PEA_1_T23. Table 726 below describes the starting and ending position of this 
segment on each transcript. 

Table 726 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 2111 | 2254 
M78076_PEA_1_T3 | 2327 | 2470 
M78076_PEA_1_T5 | 2586 | 2729 
M78076_PEA_1_T13 | 1781 | 1924 
M78076_PEA_1_T15 | 1973 | 2116 
M78076_PEA_1_T23 | 1693 | 1836 



Segment cluster M78076_PEA_1_node_54 according to the present invention is 
supported by 133 libraries. The number of libraries was determined as previously described. 
This segment can be found in the following transcript(s): M78076_PEA_1_T2, 
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, 
M78076_PEA_1_T23 and M78076_PEA_1_T28. Table 727 below describes the starting and 
ending position of this segment on each transcript. 

Table 727 - Segment location on transcripts 

Transcript name | Segment starting position | Segment ending position 
M78076_PEA_1_T2 | 2412 | 2715 
M78076_PEA_1_T3 | 2628 | 2931 
M78076_PEA_1_T5 | 2887 | 3190 
M78076_PEA_1_T13 | 2082 | 2385 
M78076_PEA_1_T15 | 2274 | 2577 
M78076_PEA_1_T23 | 1994 | 2297 
M78076_PEA_1_T28 | 1492 | 1795 



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
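The distinction between the longer segments tabulated above and the short segments that follow can be checked from the tables themselves, because a segment's length is its ending position minus its starting position plus one when both positions are inclusive. The following is a minimal sketch only: the function name `segment_length` and the constant `SHORT_SEGMENT_MAX_BP` are hypothetical names introduced here, and treating "up to about 120 bp" as a hard cutoff is an assumption. The coordinates are copied from Tables 725 to 727.

```python
# Illustrative sketch: segment length from inclusive start/end positions.
SHORT_SEGMENT_MAX_BP = 120  # assumption: "up to about 120 bp" read as a hard cutoff


def segment_length(start: int, end: int) -> int:
    """Length of a segment whose start and end positions are both inclusive."""
    if end < start:
        raise ValueError("end position precedes start position")
    return end - start + 1


# (start, end) pairs copied from Tables 725-727 of this section.
segments = {
    "node_46 on T15": (1852, 1972),
    "node_47 on T2": (2111, 2254),
    "node_54 on T2": (2412, 2715),
}

lengths = {name: segment_length(*pos) for name, pos in segments.items()}
is_short = {name: n <= SHORT_SEGMENT_MAX_BP for name, n in lengths.items()}

for name, n in lengths.items():
    print(f"{name}: {n} bp, short={is_short[name]}")
```

Under these assumptions, none of the three segments described before this point falls within the short-segment limit, which is consistent with their being described separately from the segments that follow.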



Segment cluster M78076_PEA_1_node_1 according to the present invention is supported
by 47 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 728 below
describes the starting and ending position of this segment on each transcript.

Table 728 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     161                         204
M78076_PEA_1_T3     161                         204
M78076_PEA_1_T5     161                         204
M78076_PEA_1_T13    161                         204
M78076_PEA_1_T15    161                         204
M78076_PEA_1_T23    161                         204
M78076_PEA_1_T26    161                         204
M78076_PEA_1_T27    161                         204
M78076_PEA_1_T28    161                         204



Segment cluster M78076_PEA_1_node_2 according to the present invention can be found
in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3, M78076_PEA_1_T5,
M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23, M78076_PEA_1_T26,
M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 729 below describes the starting and
ending position of this segment on each transcript.

Table 729 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     205                         224
M78076_PEA_1_T3     205                         224
M78076_PEA_1_T5     205                         224
M78076_PEA_1_T13    205                         224
M78076_PEA_1_T15    205                         224
M78076_PEA_1_T23    205                         224
M78076_PEA_1_T26    205                         224
M78076_PEA_1_T27    205                         224
M78076_PEA_1_T28    205                         224



Segment cluster M78076_PEA_1_node_3 according to the present invention is supported
by 52 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 730 below
describes the starting and ending position of this segment on each transcript.

Table 730 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     225                         288
M78076_PEA_1_T3     225                         288
M78076_PEA_1_T5     225                         288
M78076_PEA_1_T13    225                         288
M78076_PEA_1_T15    225                         288
M78076_PEA_1_T23    225                         288
M78076_PEA_1_T26    225                         288
M78076_PEA_1_T27    225                         288
M78076_PEA_1_T28    225                         288



Segment cluster M78076_PEA_1_node_6 according to the present invention is supported
by 59 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 731 below
describes the starting and ending position of this segment on each transcript.

Table 731 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     289                         370
M78076_PEA_1_T3     289                         370
M78076_PEA_1_T5     289                         370
M78076_PEA_1_T13    289                         370
M78076_PEA_1_T15    289                         370
M78076_PEA_1_T23    289                         370
M78076_PEA_1_T26    289                         370
M78076_PEA_1_T27    289                         370
M78076_PEA_1_T28    289                         370



Segment cluster M78076_PEA_1_node_7 according to the present invention is supported
by 64 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 732 below
describes the starting and ending position of this segment on each transcript.

Table 732 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     371                         432
M78076_PEA_1_T3     371                         432
M78076_PEA_1_T5     371                         432
M78076_PEA_1_T13    371                         432
M78076_PEA_1_T15    371                         432
M78076_PEA_1_T23    371                         432
M78076_PEA_1_T26    371                         432
M78076_PEA_1_T27    371                         432
M78076_PEA_1_T28    371                         432



Segment cluster M78076_PEA_1_node_12 according to the present invention is
supported by 71 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 733 below
describes the starting and ending position of this segment on each transcript.

Table 733 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     566                         678
M78076_PEA_1_T3     566                         678
M78076_PEA_1_T5     566                         678
M78076_PEA_1_T13    566                         678
M78076_PEA_1_T15    566                         678
M78076_PEA_1_T23    566                         678
M78076_PEA_1_T26    566                         678
M78076_PEA_1_T27    566                         678
M78076_PEA_1_T28    566                         678



Segment cluster M78076_PEA_1_node_22 according to the present invention is
supported by 92 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26, M78076_PEA_1_T27 and M78076_PEA_1_T28. Table 734 below
describes the starting and ending position of this segment on each transcript.

Table 734 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1123                        1197
M78076_PEA_1_T3     1123                        1197
M78076_PEA_1_T5     1123                        1197
M78076_PEA_1_T13    1123                        1197
M78076_PEA_1_T15    1123                        1197
M78076_PEA_1_T23    1123                        1197
M78076_PEA_1_T26    1123                        1197
M78076_PEA_1_T27    1123                        1197
M78076_PEA_1_T28    1123                        1197

Segment cluster M78076_PEA_1_node_27 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T27. Table 735 below describes the
starting and ending position of this segment on each transcript.

Table 735 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T27    1486                        1489



Segment cluster M78076_PEA_1_node_30 according to the present invention is
supported by 90 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26 and M78076_PEA_1_T27. Table 736 below describes the starting and
ending position of this segment on each transcript.

Table 736 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1486                        1557
M78076_PEA_1_T3     1486                        1557
M78076_PEA_1_T5     1486                        1557
M78076_PEA_1_T13    1486                        1557
M78076_PEA_1_T15    1486                        1557
M78076_PEA_1_T23    1327                        1398
M78076_PEA_1_T26    1486                        1557
M78076_PEA_1_T27    3133                        3204
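Table 736 places the same segment at different coordinates on different transcripts. As an illustration only, the sketch below computes the segment length on each transcript and the shift of its starting position relative to one transcript. The choice of M78076_PEA_1_T2 as the reference and the variable names are assumptions made here for the example; the coordinates are copied from Table 736.

```python
# Inclusive (start, end) positions of segment node_30, copied from Table 736.
NODE_30 = {
    "M78076_PEA_1_T2": (1486, 1557),
    "M78076_PEA_1_T3": (1486, 1557),
    "M78076_PEA_1_T5": (1486, 1557),
    "M78076_PEA_1_T13": (1486, 1557),
    "M78076_PEA_1_T15": (1486, 1557),
    "M78076_PEA_1_T23": (1327, 1398),
    "M78076_PEA_1_T26": (1486, 1557),
    "M78076_PEA_1_T27": (3133, 3204),
}

REFERENCE = "M78076_PEA_1_T2"  # assumption: any listed transcript could serve as reference

# Length of the segment on each transcript (inclusive coordinates).
lengths = {name: end - start + 1 for name, (start, end) in NODE_30.items()}

# Shift of the segment start relative to the reference transcript.
ref_start = NODE_30[REFERENCE][0]
offsets = {name: start - ref_start for name, (start, _) in NODE_30.items()}

for name in NODE_30:
    print(f"{name}: length={lengths[name]}, offset={offsets[name]:+d}")
```

The length is the same on every transcript, while the starting position is shifted on M78076_PEA_1_T23 and M78076_PEA_1_T27, which presumably reflects differences in the sequence upstream of this segment on those transcripts.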

Segment cluster M78076_PEA_1_node_31 according to the present invention is
supported by 89 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23,
M78076_PEA_1_T26 and M78076_PEA_1_T27. Table 737 below describes the starting and
ending position of this segment on each transcript.

Table 737 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1558                        1585
M78076_PEA_1_T3     1558                        1585
M78076_PEA_1_T5     1558                        1585
M78076_PEA_1_T13    1558                        1585
M78076_PEA_1_T15    1558                        1585
M78076_PEA_1_T23    1399                        1426
M78076_PEA_1_T26    1558                        1585
M78076_PEA_1_T27    3205                        3232



Segment cluster M78076_PEA_1_node_34 according to the present invention is
supported by 103 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and
M78076_PEA_1_T23. Table 738 below describes the starting and ending position of this
segment on each transcript.

Table 738 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1586                        1693
M78076_PEA_1_T3     1586                        1693
M78076_PEA_1_T5     1586                        1693
M78076_PEA_1_T13    1586                        1693
M78076_PEA_1_T15    1586                        1693
M78076_PEA_1_T23    1427                        1534



Segment cluster M78076_PEA_1_node_36 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and M78076_PEA_1_T23.
Table 739 below describes the starting and ending position of this segment on each transcript.

Table 739 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1953                        1976
M78076_PEA_1_T3     1694                        1717
M78076_PEA_1_T5     1953                        1976
M78076_PEA_1_T13    1694                        1717
M78076_PEA_1_T15    1694                        1717
M78076_PEA_1_T23    1535                        1558



Segment cluster M78076_PEA_1_node_41 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T3 and M78076_PEA_1_T5. Table 740
below describes the starting and ending position of this segment on each transcript.

Table 740 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T3     2181                        2192
M78076_PEA_1_T5     2440                        2451

Segment cluster M78076_PEA_1_node_42 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T15 and M78076_PEA_1_T23. Table 741 below
describes the starting and ending position of this segment on each transcript.

Table 741 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1977                        1985
M78076_PEA_1_T3     2193                        2201
M78076_PEA_1_T5     2452                        2460
M78076_PEA_1_T15    1718                        1726
M78076_PEA_1_T23    1559                        1567



Segment cluster M78076_PEA_1_node_43 according to the present invention is
supported by 110 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T15 and M78076_PEA_1_T23.
Table 742 below describes the starting and ending position of this segment on each transcript.

Table 742 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     1986                        2047
M78076_PEA_1_T3     2202                        2263
M78076_PEA_1_T5     2461                        2522
M78076_PEA_1_T15    1727                        1788
M78076_PEA_1_T23    1568                        1629

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 743.

Table 743 - Oligonucleotides related to this segment

Oligonucleotide name   Overexpressed in cancers   Chip reference
M78076_0_7_0           lung malignant tumors      LUN



Segment cluster M78076_PEA_1_node_45 according to the present invention is
supported by 132 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and
M78076_PEA_1_T23. Table 744 below describes the starting and ending position of this
segment on each transcript.

Table 744 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2048                        2110
M78076_PEA_1_T3     2264                        2326
M78076_PEA_1_T5     2523                        2585
M78076_PEA_1_T13    1718                        1780
M78076_PEA_1_T15    1789                        1851
M78076_PEA_1_T23    1630                        1692



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 745.

Table 745 - Oligonucleotides related to this segment

Oligonucleotide name   Overexpressed in cancers   Chip reference
M78076_0_7_0           lung malignant tumors      LUN



Segment cluster M78076_PEA_1_node_49 according to the present invention is
supported by 129 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and
M78076_PEA_1_T23. Table 746 below describes the starting and ending position of this
segment on each transcript.

Table 746 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2255                        2290
M78076_PEA_1_T3     2471                        2506
M78076_PEA_1_T5     2730                        2765
M78076_PEA_1_T13    1925                        1960
M78076_PEA_1_T15    2117                        2152
M78076_PEA_1_T23    1837                        1872



Segment cluster M78076_PEA_1_node_50 according to the present invention is
supported by 125 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and
M78076_PEA_1_T23. Table 747 below describes the starting and ending position of this
segment on each transcript.

Table 747 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2291                        2329
M78076_PEA_1_T3     2507                        2545
M78076_PEA_1_T5     2766                        2804
M78076_PEA_1_T13    1961                        1999
M78076_PEA_1_T15    2153                        2191
M78076_PEA_1_T23    1873                        1911



Segment cluster M78076_PEA_1_node_51 according to the present invention is
supported by 123 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): M78076_PEA_1_T2,
M78076_PEA_1_T3, M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and
M78076_PEA_1_T23. Table 748 below describes the starting and ending position of this
segment on each transcript.

Table 748 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2330                        2388
M78076_PEA_1_T3     2546                        2604
M78076_PEA_1_T5     2805                        2863
M78076_PEA_1_T13    2000                        2058
M78076_PEA_1_T15    2192                        2250
M78076_PEA_1_T23    1912                        1970



Segment cluster M78076_PEA_1_node_52 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15 and M78076_PEA_1_T23.
Table 749 below describes the starting and ending position of this segment on each transcript.

Table 749 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2389                        2405
M78076_PEA_1_T3     2605                        2621
M78076_PEA_1_T5     2864                        2880
M78076_PEA_1_T13    2059                        2075
M78076_PEA_1_T15    2251                        2267
M78076_PEA_1_T23    1971                        1987
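On a given transcript, the segments tabulated above follow one another without gaps or overlaps: each segment starts one position after the previous one ends. The sketch below checks this for transcript M78076_PEA_1_T2, using coordinates copied from Tables 726, 742, 744 and 746 to 749. The helper name `find_breaks` is hypothetical and introduced only for this illustration.

```python
from typing import Dict, List, Tuple

# Inclusive (start, end) positions on transcript M78076_PEA_1_T2,
# copied from Tables 726, 742, 744 and 746-749.
T2_SEGMENTS: Dict[str, Tuple[int, int]] = {
    "node_43": (1986, 2047),
    "node_45": (2048, 2110),
    "node_47": (2111, 2254),
    "node_49": (2255, 2290),
    "node_50": (2291, 2329),
    "node_51": (2330, 2388),
    "node_52": (2389, 2405),
}


def find_breaks(segments: Dict[str, Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Return (previous, next) segment-name pairs that do not abut exactly."""
    ordered = sorted(segments.items(), key=lambda item: item[1][0])
    breaks = []
    for (prev_name, (_, prev_end)), (next_name, (next_start, _)) in zip(ordered, ordered[1:]):
        # Adjacent segments should satisfy next_start == prev_end + 1.
        if next_start != prev_end + 1:
            breaks.append((prev_name, next_name))
    return breaks


breaks = find_breaks(T2_SEGMENTS)
print("contiguous" if not breaks else f"breaks between: {breaks}")
```

A check of this kind is also a quick way to catch transcription errors in the coordinate tables, since a single mistyped position shows up as a gap or an overlap.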



Segment cluster M78076_PEA_1_node_53 according to the present invention can be
found in the following transcript(s): M78076_PEA_1_T2, M78076_PEA_1_T3,
M78076_PEA_1_T5, M78076_PEA_1_T13, M78076_PEA_1_T15, M78076_PEA_1_T23 and
M78076_PEA_1_T28. Table 750 below describes the starting and ending position of this
segment on each transcript.

Table 750 - Segment location on transcripts

Transcript name     Segment starting position   Segment ending position
M78076_PEA_1_T2     2406                        2411
M78076_PEA_1_T3     2622                        2627
M78076_PEA_1_T5     2881                        2886
M78076_PEA_1_T13    2076                        2081
M78076_PEA_1_T15    2268                        2273
M78076_PEA_1_T23    1988                        1993
M78076_PEA_1_T28    1486                        1491

Variant protein alignment to the previously known protein:
Sequence name: APP1_HUMAN

Sequence documentation:

Alignment of: M78076_PEA_1_P3 x APP1_HUMAN
Alignment segment 1/1:

Quality: 5132.00
Escore: 0
Matching length: 517
Total length: 517
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500

501 GGSSEDKGGLQPPDSKD 517
    |||||||||||||||||
501 GGSSEDKGGLQPPDSKD 517



Sequence name: APP1_HUMAN

Sequence documentation:

Alignment of: M78076_PEA_1_P4 x APP1_HUMAN
Alignment segment 1/1:

Quality: 5223.00
Escore: 0
Matching length: 526
Total length: 526
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500

501 GGSSEDKGGLQPPDSKDDTPMTLPKG 526
    ||||||||||||||||||||||||||
501 GGSSEDKGGLQPPDSKDDTPMTLPKG 526



Sequence name: APP1_HUMAN

Sequence documentation:

Alignment of: M78076_PEA_1_P12 x APP1_HUMAN
Alignment segment 1/1:

Quality: 5223.00
Escore: 0
Matching length: 526
Total length: 526
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500

501 GGSSEDKGGLQPPDSKDDTPMTLPKG 526
    ||||||||||||||||||||||||||
501 GGSSEDKGGLQPPDSKDDTPMTLPKG 526



Sequence name: APP1_HUMAN

Sequence documentation:

Alignment of: M78076_PEA_1_P14 x APP1_HUMAN
Alignment segment 1/1:

Quality: 5672.00
Escore: 0
Matching length: 575
Total length: 575
Matching Percent Similarity: 99.48
Matching Percent Identity: 99.48
Total Percent Similarity: 99.48
Total Percent Identity: 99.48
Gaps: 0

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500

501 GGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEKMNPLEQYERKV 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 GGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEKMNPLEQYERKV 550

551 NASVPRGFPFHSSEIQRDELVRGGT 575
    ||||||||||||||||||||   ||
551 NASVPRGFPFHSSEIQRDELAPAGT 575
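The identity figure reported for the M78076_PEA_1_P14 alignment can be reproduced from the alignment rows. The following is a sketch under stated assumptions, not the alignment program that produced the output above: the helper names `match_line` and `percent_identity` are hypothetical, residues 1 to 550 are taken as identical because that is what the alignment shows, and only the final row (positions 551 to 575) is compared character by character.

```python
def match_line(top: str, bottom: str) -> str:
    """Build the marker row: '|' where residues are identical, ' ' otherwise."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    return "".join("|" if a == b else " " for a, b in zip(top, bottom))


def percent_identity(matches: int, length: int) -> float:
    """Identical positions as a percentage of the aligned length, 2 decimals."""
    return round(100.0 * matches / length, 2)


# Final alignment row (positions 551-575), copied from the alignment above.
VARIANT_TAIL = "NASVPRGFPFHSSEIQRDELVRGGT"  # M78076_PEA_1_P14
KNOWN_TAIL = "NASVPRGFPFHSSEIQRDELAPAGT"    # APP1_HUMAN

IDENTICAL_PREFIX = 550  # assumption: positions 1-550 are identical, as the alignment shows
TOTAL_LENGTH = 575      # "Total length" reported for this alignment

tail_marks = match_line(VARIANT_TAIL, KNOWN_TAIL)
matches = IDENTICAL_PREFIX + tail_marks.count("|")
identity = percent_identity(matches, TOTAL_LENGTH)

print(repr(tail_marks))
print(matches, identity)
```

The three mismatched positions in the final row account for the whole difference between the reported identity and 100 percent.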



Sequence name: APP1_HUMAN

Sequence documentation:

Alignment of: M78076_PEA_1_P21 x APP1_HUMAN
Alignment segment 1/1:

Quality: 5822.00
Escore: 0
Matching length: 597
Total length: 650
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 91.85
Total Percent Identity: 91.85
Gaps: 1

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NE................................................ 352
    ||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

353 .....AERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 397
         |||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

398 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 447
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAP 500

448 GGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEKMNPLEQYERKV 497
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 GGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEKMNPLEQYERKV 550

498 NASVPRGFPFHSSEIQRDELAPAGTGVSREAVSGLLIMGAGGGSLIVLSM 547
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 NASVPRGFPFHSSEIQRDELAPAGTGVSREAVSGLLIMGAGGGSLIVLSM 600
548 LLLRRKKPYGAISHGVVEVDPMLTLEEQQLRELQRHGYENPTYRFLEERP 597 

I I I i I M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I M I I I I I I I I I 

601 LLLRRKKP YGAI SHGWEVDPMLTLEEQQLREL QRHGYENPT YRFLEERP 650 
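
The summary figures reported for the M78076_PEA_1_P21 x APP1_HUMAN alignment are related by simple arithmetic. The following is a minimal sketch in Python, assuming (the application does not state the formula) that each percentage is a count of positions divided by the relevant length; the helper name `percent` is hypothetical.

```python
def percent(count: int, length: int) -> float:
    """Return count/length as a percentage rounded to two decimals."""
    return round(100.0 * count / length, 2)

# Values reported in the alignment header.
matching_length = 597   # aligned (non-gap) positions
total_length = 650      # alignment length including gap positions

# Matching Percent Identity is reported as 100.00, so every aligned
# position is taken to be identical (an assumption of this sketch).
identical = matching_length

matching_identity = percent(identical, matching_length)
total_identity = percent(identical, total_length)
gap_positions = total_length - matching_length

print(matching_identity, total_identity, gap_positions)
```

Under these assumptions the sketch reproduces the reported Total Percent Identity of 91.85, and the difference between the two lengths gives the number of positions covered by the single gap.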



Sequence name: APP1_HUMAN
Sequence documentation:

Alignment of: M78076_PEA_1_P24 x APP1_HUMAN
Alignment segment 1/1:

Quality: 4791.00
Escore: 0
Matching length: 485
Total length: 485
Matching Percent Similarity: 99.79
Matching Percent Identity: 99.59
Total Percent Similarity: 99.79
Total Percent Identity: 99.59
Gaps: 0

Alignment:



  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIRECL 485
    |||||||||||||||||||||||||||||||:| |
451 THLQVIEERVNQSLGLLDQNPHLAQELRPQIQELL 485



Sequence name: APP1_HUMAN
Sequence documentation:

Alignment of: M78076_PEA_1_P2 x APP1_HUMAN
Alignment segment 1/1:

Quality: 4474.00
Escore: 0
Matching length: 454
Total length: 454
Matching Percent Similarity: 99.56
Matching Percent Identity: 99.34
Total Percent Similarity: 99.56
Total Percent Identity: 99.34
Gaps: 0

Alignment:



  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVL 450
    ||||||||||||||||||||||||||||||||||||||||||||||||| 
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVH 450

451 TSFQ 454
    | :|
451 THLQ 454



Sequence name: APP1_HUMAN
Sequence documentation:

Alignment of: M78076_PEA_1_P25 x APP1_HUMAN
Alignment segment 1/1:

Quality: 4455.00
Escore: 0
Matching length: 448
Total length: 448
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEA 50

 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 PGSAQVAGLCGRLTLHRDLRTGRWEPDPQRSRRCLRDPQRVLEYCRQMYP 100

101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEAL 150

151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LVPEGCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSD 200

201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 RFRGVEYVCCPPPGTPDPSGTAVGDPSTRSWPPGSRVEGAEDEEEEESFP 250

251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 QPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGM 300

301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 PGEISEHEGFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQAL 350

351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NEHFQSILQTLEEQVSGERQRLVETHATRVIALINDQRRAALEGFLAALQ 400

401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQ 448
    ||||||||||||||||||||||||||||||||||||||||||||||||
401 ADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQ 448



DESCRIPTION FOR CLUSTER T99080 
Cluster T99080 features 14 transcript(s) and 11 segment(s) of interest, the names for
which are given in Tables 751 and 752, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 753.

Table 751 - Transcripts of interest

Transcript Name       Sequence ID No.
T99080_PEA_4_T0       83
T99080_PEA_4_T2       84
T99080_PEA_4_T4       85
T99080_PEA_4_T6       86
T99080_PEA_4_T9       87
T99080_PEA_4_T10      88
T99080_PEA_4_T11      89
T99080_PEA_4_T13      90
T99080_PEA_4_T14      91
T99080_PEA_4_T17      92
T99080_PEA_4_T18      93
T99080_PEA_4_T19      94
T99080_PEA_4_T20      95
T99080_PEA_4_T21      96


Table 752 - Segments of interest

Segment Name            Sequence ID No.
T99080_PEA_4_node_1     695
T99080_PEA_4_node_6     696
T99080_PEA_4_node_11    697
T99080_PEA_4_node_19    698
T99080_PEA_4_node_20    699
T99080_PEA_4_node_3     700
T99080_PEA_4_node_5     701
T99080_PEA_4_node_8     702
T99080_PEA_4_node_13    703
T99080_PEA_4_node_15    704
T99080_PEA_4_node_18    705

Table 753 - Proteins of interest

Protein Name        Sequence ID No.    Corresponding Transcript(s)
T99080_PEA_4_P1     1358               T99080_PEA_4_T0
T99080_PEA_4_P2     1359               T99080_PEA_4_T2
T99080_PEA_4_P5     1360               T99080_PEA_4_T6
T99080_PEA_4_P8     1361               T99080_PEA_4_T9
T99080_PEA_4_P9     1362               T99080_PEA_4_T10
T99080_PEA_4_P10    1363               T99080_PEA_4_T11
T99080_PEA_4_P12    1364               T99080_PEA_4_T14
T99080_PEA_4_P13    1365               T99080_PEA_4_T17
T99080_PEA_4_P14    1366               T99080_PEA_4_T18
T99080_PEA_4_P15    1367               T99080_PEA_4_T19
T99080_PEA_4_P16    1368               T99080_PEA_4_T20
T99080_PEA_4_P17    1369               T99080_PEA_4_T21


These sequences are variants of the known protein Acylphosphatase, organ-common type
isozyme (SwissProt accession identifier ACYO_HUMAN; known also according to the
synonyms EC 3.6.1.7; Acylphosphate phosphohydrolase; Acylphosphatase, erythrocyte
isozyme), SEQ ID NO: 1440, referred to herein as the previously known protein.

The sequence for protein Acylphosphatase, organ-common type isozyme is given at the
end of the application, as "Acylphosphatase, organ-common type isozyme amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 754.



Table 754 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
19                                        G->R

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: phosphate metabolism, which are annotation(s) related to Biological
Process; and acylphosphatase, which are annotation(s) related to Molecular Function.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster T99080 features 14 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein
Acylphosphatase, organ-common type isozyme. A description of each variant protein according
to the present invention is now provided.

Variant protein T99080_PEA_4_P1 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T0. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T99080_PEA_4_P1 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 755 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P1
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 755 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
23                                        A->V                         Yes

Variant protein T99080_PEA_4_P1 is encoded by the following transcript(s):
T99080_PEA_4_T0, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T0 is shown in bold; this coding portion starts at
position 226 and ends at position 411. The transcript also has the following SNPs as listed in
Table 756 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P1 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 756 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C->T                        Yes
1293                                   G->C                        Yes
2034                                   A->G                        Yes
2114                                   A->C                        Yes
2153                                   -> A                        No
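
The nucleotide and amino acid SNP tables for this variant can be cross-checked against each other. The following is a minimal sketch in Python, assuming 1-based transcript coordinates and a coding portion whose first base is the first base of codon 1; the helper name `codon_of` is hypothetical and not part of the application.

```python
def codon_of(nt_pos: int, cds_start: int) -> tuple[int, int]:
    """Map a 1-based transcript position to (codon number, base within codon).

    Both returned values are 1-based; cds_start is the 1-based position of
    the first base of the coding portion.
    """
    offset = nt_pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the coding portion")
    return offset // 3 + 1, offset % 3 + 1

# Transcript T99080_PEA_4_T0: coding portion 226-411, SNP C->T at position 293.
codon, base = codon_of(293, 226)
print(codon, base)  # prints "23 2"
```

Position 293 falls on the second base of codon 23, which agrees with the A->V change at amino acid position 23 in Table 755: alanine codons have the form GCN and valine codons the form GTN, so a C->T change at the second base converts one into the other. The remaining SNPs in Table 756 lie beyond position 411, that is, outside the stated coding portion.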



Variant protein T99080_PEA_4_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T2. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: membrane. The protein localization is believed to be membrane because,
although it is a partial protein, both trans-membrane region prediction programs predict
that this protein has a trans-membrane region.

Variant protein T99080_PEA_4_P2 is encoded by the following transcript(s):
T99080_PEA_4_T2, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T2 is shown in bold; this coding portion starts at
position 1 and ends at position 192. The transcript also has the following SNPs as listed in Table
757 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein T99080_PEA_4_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 757 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
1074                                   G->C                        Yes
1815                                   A->G                        Yes
1895                                   A->C                        Yes
1934                                   -> A                        No



Variant protein T99080_PEA_4_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T6. An alignment is given to the known protein (Acylphosphatase, organ-common
type isozyme) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between T99080_PEA_4_P5 and ACYO_HUMAN_V1 (SEQ ID NO:
1441):

1. An isolated chimeric polypeptide encoding for T99080_PEA_4_P5, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MPASARLAGAGLLLAFLRALGCAGRAPGLS corresponding to amino acids 1
- 30 of T99080_PEA_4_P5, and a second amino acid sequence being at least 90% homologous
to
MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQGQLQGPIS
KVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK corresponding to amino
acids 1 - 99 of ACYO_HUMAN_V1, which also corresponds to amino acids 31 - 129 of
T99080_PEA_4_P5, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of T99080_PEA_4_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MPASARLAGAGLLLAFLRALGCAGRAPGLS of T99080_PEA_4_P5.
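
The coordinates given in the comparison report can be checked by concatenating the two quoted segments. The following is a minimal sketch in Python that uses only the sequences quoted above; the variable names and the helper `segment` are illustrative and not part of the application.

```python
# Head segment quoted for T99080_PEA_4_P5 (stated to be amino acids 1-30).
head = "MPASARLAGAGLLLAFLRALGCAGRAPGLS"

# Segment quoted for ACYO_HUMAN_V1 (stated to be amino acids 1-99).
known = (
    "MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQGQLQGPIS"
    "KVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK"
)

# The report describes the two segments as contiguous and in sequential order.
variant = head + known


def segment(seq: str, first: int, last: int) -> str:
    """Return residues first..last of seq, using 1-based inclusive coordinates."""
    return seq[first - 1:last]


print(len(head), len(known), len(variant))   # prints "30 99 129"
print(segment(variant, 31, 129) == known)    # prints "True"
```

The segment lengths agree with the stated ranges: 30 residues for the head and 99 residues for the portion shared with ACYO_HUMAN_V1, giving 129 residues in total.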

It should be noted that the known protein sequence (ACYO_HUMAN) differs by one or more
changes from the sequence given at the end of the application and named as being the amino
acid sequence for ACYO_HUMAN_V1. These changes were previously known to occur and
are listed in the table below.

Table 758 - Changes to ACYO_HUMAN_V1

SNP position(s) on amino acid sequence    Type of change
1                                         init_met



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein T99080_PEA_4_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 759 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 759 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
23                                        A->V                         Yes



Variant protein T99080_PEA_4_P5 is encoded by the following transcript(s):
T99080_PEA_4_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T6 is shown in bold; this coding portion starts at
position 226 and ends at position 612. The transcript also has the following SNPs as listed in
Table 760 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 760 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C->T                        Yes
697                                    A->G                        Yes
777                                    A->C                        Yes
816                                    -> A                        No
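
The coding-portion coordinates can be compared with the protein length given in the comparison report. The following is a minimal sketch in Python, assuming the start and end positions are 1-based and inclusive; the function names are hypothetical.

```python
def coding_length(start: int, end: int) -> int:
    """Number of nucleotides in a coding portion with 1-based inclusive ends."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the coding portion; rejects partial codons."""
    length = coding_length(start, end)
    if length % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return length // 3


# Transcript T99080_PEA_4_T6: coding portion starts at 226 and ends at 612.
nucleotides = coding_length(226, 612)
codons = codon_count(226, 612)
print(nucleotides, codons)  # prints "387 129"
```

The 129 codons obtained here match the 129 amino acids (30 + 99) described for T99080_PEA_4_P5 in the comparison report, which suggests, although the application does not say so, that the quoted coding portion excludes the stop codon.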



Variant protein T99080_PEA_4_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T9. An alignment is given to the known protein (Acylphosphatase, organ-common
type isozyme) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between T99080_PEA_4_P8 and ACYO_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for T99080_PEA_4_P8, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence M corresponding to amino acids 1 - 1 of T99080_PEA_4_P8, and a second amino
acid sequence being at least 90% homologous to
QAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHIDKANFNNE
KVILKLDYSDFQIVK corresponding to amino acids 28 - 99 of ACYO_HUMAN_V1, which
also corresponds to amino acids 2 - 73 of T99080_PEA_4_P8, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

It should be noted that the known protein sequence (ACYO_HUMAN) differs by one or more
changes from the sequence given at the end of the application and named as being the amino
acid sequence for ACYO_HUMAN_V1. These changes were previously known to occur and
are listed in the table below.



Table 761 - Changes to ACYO_HUMAN_V1

SNP position(s) on amino acid sequence    Type of change
1                                         init_met



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellularly because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein T99080_PEA_4_P8 is encoded by the following transcript(s):
T99080_PEA_4_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T9 is shown in bold; this coding portion starts at
position 162 and ends at position 380. The transcript also has the following SNPs as listed in
Table 762 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P8 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 762 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
465                                    A->G                        Yes
545                                    A->C                        Yes
584                                    -> A                        No



Variant protein T99080_PEA_4_P9 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T10. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: membrane. The protein localization is believed to be membrane because,
although it is a partial protein, both trans-membrane region prediction programs predict
that this protein has a trans-membrane region.
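
The localization calls given for the variant proteins above follow a recurring pattern, which can be summarized as a small decision rule. The following is a minimal sketch in Python, assuming two signal-peptide predictors and two trans-membrane predictors as the text implies; the function name, the precedence given to the trans-membrane call, and the "undetermined" outcome for disagreeing programs are assumptions of this sketch, not statements from the application.

```python
from typing import Sequence


def predict_localization(signal_peptide_calls: Sequence[bool],
                         transmembrane_calls: Sequence[bool]) -> str:
    """Combine per-program predictions into a single localization call.

    Each argument holds one boolean per prediction program: True when that
    program predicts a signal peptide (or a trans-membrane region).
    """
    all_sp = all(signal_peptide_calls)
    no_sp = not any(signal_peptide_calls)
    all_tm = all(transmembrane_calls)
    no_tm = not any(transmembrane_calls)

    if all_tm:
        # Both trans-membrane programs predict a trans-membrane region.
        return "membrane"
    if all_sp and no_tm:
        # Both signal-peptide programs agree and no trans-membrane region.
        return "secreted"
    if no_sp and no_tm:
        # No signal peptide and no trans-membrane region predicted.
        return "intracellular"
    # Disagreement between programs is not covered by the text (assumption).
    return "undetermined"


print(predict_localization([True, True], [False, False]))    # secreted
print(predict_localization([False, False], [True, True]))    # membrane
print(predict_localization([False, False], [False, False]))  # intracellular
```

The three printed cases correspond to the three rationales that appear in the text: secreted, membrane and intracellular.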

Variant protein T99080_PEA_4_P9 is encoded by the following transcript(s):
T99080_PEA_4_T10, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T10 is shown in bold; this coding portion starts at
position 1 and ends at position 261. The transcript also has the following SNPs as listed in Table
763 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein T99080_PEA_4_P9 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 763 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
557                                    A->G                        Yes
637                                    A->C                        Yes
676                                    -> A                        No



Variant protein T99080_PEA_4_P10 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T11. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: membrane. The protein localization is believed to be membrane because,
although it is a partial protein, both trans-membrane region prediction programs predict
that this protein has a trans-membrane region.

Variant protein T99080_PEA_4_P10 is encoded by the following transcript(s):
T99080_PEA_4_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T11 is shown in bold; this coding portion starts at
position 1 and ends at position 240. The transcript also has the following SNPs as listed in Table
764 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein T99080_PEA_4_P10 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 764 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
269                                    G->T                        Yes
592                                    A->G                        Yes
672                                    A->C                        Yes
711                                    -> A                        No



Variant protein T99080_PEA_4_P12 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T14. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: membrane. The protein localization is believed to be membrane because,
although it is a partial protein, both trans-membrane region prediction programs predict
that this protein has a trans-membrane region.

Variant protein T99080_PEA_4_P12 is encoded by the following transcript(s):
T99080_PEA_4_T14, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T14 is shown in bold; this coding portion starts at
position 1 and ends at position 282.

Variant protein T99080_PEA_4_P13 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T17. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: membrane. The protein localization is believed to be membrane because,
although it is a partial protein, both trans-membrane region prediction programs predict
that this protein has a trans-membrane region.

Variant protein T99080_PEA_4_P13 is encoded by the following transcript(s):
T99080_PEA_4_T17, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T17 is shown in bold; this coding portion starts at
position 1 and ends at position 207.



Variant protein T99080_PEA_4_P14 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T18. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T99080_PEA_4_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 765 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P14
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 765 - Amino acid mutations 



SNP positicm(s) on amino acid 
sequence , 


Alternative amino acid(s) 


; Prwwsl^idiown SNP? 


23 


A->V 


Yes 



Variant protein T99080_PEA_4_P14 is encoded by the following transcript(s):
T99080_PEA_4_T18, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T18 is shown in bold; this coding portion starts at
position 226 and ends at position 480. The transcript also has the following SNPs as listed in
Table 766 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P14 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 766 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C -> T                      Yes
776                                    A -> G                      Yes
856                                    A -> C                      Yes
895                                    -> A                        No
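
By way of non-limiting illustration only, the relationship between the nucleotide SNP positions of Table 766 and the amino acid SNP position of Table 765 can be checked arithmetically from the stated coding portion (positions 226 to 480 of T99080_PEA_4_T18). The function name below is an assumption made for this sketch; positions are treated as 1-based and inclusive, as in the tables.

```python
def nucleotide_to_residue(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to (residue number, position in codon).

    Returns None when the position lies outside the coding portion.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


# Coding portion of transcript T99080_PEA_4_T18 as stated in the text.
CDS_START, CDS_END = 226, 480

for pos in (293, 776, 856, 895):
    print(pos, nucleotide_to_residue(pos, CDS_START, CDS_END))
```

Under these assumptions, position 293 falls in the second base of codon 23, which agrees with the single amino acid change (A -> V at position 23) listed in Table 765; the remaining positions lie beyond the end of the coding portion and therefore do not alter the protein sequence.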



Variant protein T99080_PEA_4_P15 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T19. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T99080_PEA_4_P15 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 767, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P15
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 767 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
23                                        A -> V                       Yes



Variant protein T99080_PEA_4_P15 is encoded by the following transcript(s):
T99080_PEA_4_T19, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T19 is shown in bold; this coding portion starts at
position 226 and ends at position 459. The transcript also has the following SNPs as listed in
Table 768 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P15 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 768 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C -> T                      Yes
488                                    G -> T                      Yes
811                                    A -> G                      Yes
891                                    A -> C                      Yes
930                                    -> A                        No



Variant protein T99080_PEA_4_P16 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T20. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T99080_PEA_4_P16 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 769, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P16
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 769 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
23                                        A -> V                       Yes



Variant protein T99080_PEA_4_P16 is encoded by the following transcript(s):
T99080_PEA_4_T20, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T20 is shown in bold; this coding portion starts at
position 226 and ends at position 501. The transcript also has the following SNPs as listed in
Table 770 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P16 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 770 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C -> T                      Yes



Variant protein T99080_PEA_4_P17 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T99080_PEA_4_T21. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T99080_PEA_4_P17 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 771, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T99080_PEA_4_P17
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 771 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
23                                        A -> V                       Yes



Variant protein T99080_PEA_4_P17 is encoded by the following transcript(s):
T99080_PEA_4_T21, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T99080_PEA_4_T21 is shown in bold; this coding portion starts at
position 226 and ends at position 426. The transcript also has the following SNPs as listed in
Table 772 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T99080_PEA_4_P17 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 772 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
293                                    C -> T                      Yes

As noted above, cluster T99080 features 11 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster T99080_PEA_4_node_1 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T0, T99080_PEA_4_T6,
T99080_PEA_4_T13, T99080_PEA_4_T18, T99080_PEA_4_T19, T99080_PEA_4_T20 and
T99080_PEA_4_T21. Table 773 below describes the starting and ending position of this
segment on each transcript.

Table 773 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T0       1                            307
T99080_PEA_4_T6       1                            307
T99080_PEA_4_T13      1                            307
T99080_PEA_4_T18      1                            307
T99080_PEA_4_T19      1                            307
T99080_PEA_4_T20      1                            307
T99080_PEA_4_T21      1                            307



Segment cluster T99080_PEA_4_node_6 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T17 and T99080_PEA_4_T21.
Table 774 below describes the starting and ending position of this segment on each transcript.




Table 774 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T17      181                          627
T99080_PEA_4_T21      400                          846



Segment cluster T99080_PEA_4_node_11 according to the present invention is supported
by 7 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T14 and T99080_PEA_4_T20.
Table 775 below describes the starting and ending position of this segment on each transcript.

Table 775 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T14      260                          782
T99080_PEA_4_T20      479                          1001



Segment cluster T99080_PEA_4_node_19 according to the present invention is supported
by 59 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T0, T99080_PEA_4_T2 and
T99080_PEA_4_T4. Table 776 below describes the starting and ending position of this segment
on each transcript.



Table 776 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T0       449                          1736
T99080_PEA_4_T2       230                          1517
T99080_PEA_4_T4       78                           1365

Segment cluster T99080_PEA_4_node_20 according to the present invention is supported
by 98 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T0, T99080_PEA_4_T2,
T99080_PEA_4_T4, T99080_PEA_4_T6, T99080_PEA_4_T9, T99080_PEA_4_T10,
T99080_PEA_4_T11, T99080_PEA_4_T13, T99080_PEA_4_T18 and T99080_PEA_4_T19.
Table 777 below describes the starting and ending position of this segment on each transcript.

Table 777 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T0       1737                         2175
T99080_PEA_4_T2       1518                         1956
T99080_PEA_4_T4       1366                         1804
T99080_PEA_4_T6       400                          838
T99080_PEA_4_T9       168                          606
T99080_PEA_4_T10      260                          698
T99080_PEA_4_T11      295                          733
T99080_PEA_4_T13      308                          746
T99080_PEA_4_T18      479                          917
T99080_PEA_4_T19      514                          952
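
By way of non-limiting illustration only, the segment location tables can be checked for internal consistency: because the same segment appears in every listed transcript, its length (ending position minus starting position, plus one) should be identical in each row. The sketch below uses the values of Table 777; the variable and function names are assumptions made for this example.

```python
# Starting and ending positions of segment T99080_PEA_4_node_20 (Table 777).
NODE_20 = {
    "T99080_PEA_4_T0": (1737, 2175),
    "T99080_PEA_4_T2": (1518, 1956),
    "T99080_PEA_4_T4": (1366, 1804),
    "T99080_PEA_4_T6": (400, 838),
    "T99080_PEA_4_T9": (168, 606),
    "T99080_PEA_4_T10": (260, 698),
    "T99080_PEA_4_T11": (295, 733),
    "T99080_PEA_4_T13": (308, 746),
    "T99080_PEA_4_T18": (479, 917),
    "T99080_PEA_4_T19": (514, 952),
}


def segment_length(start, end):
    """Length of a segment given 1-based inclusive coordinates."""
    return end - start + 1


lengths = {name: segment_length(*span) for name, span in NODE_20.items()}
distinct = set(lengths.values())
print(distinct)
```

A single distinct length indicates that the rows agree with one another; more than one value would suggest a transcription error in the table.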



Short segments related to the above cluster are also provided. These segments are up to about
120 bp in length, and so are included in a separate description.



Segment cluster T99080_PEA_4_node_3 according to the present invention is supported
by 40 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T2, T99080_PEA_4_T9,
T99080_PEA_4_T10, T99080_PEA_4_T11, T99080_PEA_4_T14 and T99080_PEA_4_T17.
Table 778 below describes the starting and ending position of this segment on each transcript.

Table 778 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T2       1                            88
T99080_PEA_4_T9       1                            88
T99080_PEA_4_T10      1                            88
T99080_PEA_4_T11      1                            88
T99080_PEA_4_T14      1                            88
T99080_PEA_4_T17      1                            88



Segment cluster T99080_PEA_4_node_5 according to the present invention is supported
by 57 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T0, T99080_PEA_4_T2,
T99080_PEA_4_T6, T99080_PEA_4_T10, T99080_PEA_4_T11, T99080_PEA_4_T14,
T99080_PEA_4_T17, T99080_PEA_4_T18, T99080_PEA_4_T19, T99080_PEA_4_T20 and
T99080_PEA_4_T21. Table 779 below describes the starting and ending position of this
segment on each transcript.

Table 779 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T0       308                          399
T99080_PEA_4_T2       89                           180
T99080_PEA_4_T6       308                          399
T99080_PEA_4_T10      89                           180
T99080_PEA_4_T11      89                           180
T99080_PEA_4_T14      89                           180
T99080_PEA_4_T17      89                           180
T99080_PEA_4_T18      308                          399
T99080_PEA_4_T19      308                          399
T99080_PEA_4_T20      308                          399
T99080_PEA_4_T21      308                          399



Segment cluster T99080_PEA_4_node_8 according to the present invention is supported
by 12 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T9, T99080_PEA_4_T10,
T99080_PEA_4_T14, T99080_PEA_4_T18 and T99080_PEA_4_T20. Table 780 below
describes the starting and ending position of this segment on each transcript.

Table 780 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T9       89                           167
T99080_PEA_4_T10      181                          259
T99080_PEA_4_T14      181                          259
T99080_PEA_4_T18      400                          478
T99080_PEA_4_T20      400                          478



Microarray (chip) data is also available for this segment as follows. As described above 
with regard to the cluster itself, various oligonucleotides were tested for being differentially 
expressed in various disease conditions, particularly cancer. The following oligonucleotides 
were found to hit this segment (in relation to lung cancer), shown in Table 781. 

Table 781 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
T99080_0_0_58896        lung malignant tumors       LUN

Segment cluster T99080_PEA_4_node_13 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T4. Table 782 below describes the
starting and ending position of this segment on each transcript.

Table 782 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T4       1                            11



Segment cluster T99080_PEA_4_node_15 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T11 and T99080_PEA_4_T19.
Table 783 below describes the starting and ending position of this segment on each transcript.

Table 783 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T11      181                          294
T99080_PEA_4_T19      400                          513



Segment cluster T99080_PEA_4_node_18 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T99080_PEA_4_T0 and T99080_PEA_4_T2. Table
784 below describes the starting and ending position of this segment on each transcript.

Table 784 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
T99080_PEA_4_T0       400                          448
T99080_PEA_4_T2       181                          229









Variant protein alignment to the previously known protein:

Sequence name: ACY0_HUMAN_V1

Sequence documentation:

Alignment of: T99080_PEA_4_P5 x ACY0_HUMAN_V1

Alignment segment 1/1:

Quality: 973.00
Escore: 0
Matching length: 99
Total length: 99
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 31 MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQG 80
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQG 50

 81 QLQGPISKVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK 129
    |||||||||||||||||||||||||||||||||||||||||||||||||
 51 QLQGPISKVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK 99


Sequence name: ACY0_HUMAN_V1

Sequence documentation:

Alignment of: T99080_PEA_4_P8 x ACY0_HUMAN_V1

Alignment segment 1/1:

Quality: 711.00
Escore: 0
Matching length: 72
Total length: 72
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  2 QAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHID 51
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 28 QAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHID 77

 52 KANFNNEKVILKLDYSDFQIVK 73
    ||||||||||||||||||||||
 78 KANFNNEKVILKLDYSDFQIVK 99
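
By way of non-limiting illustration only, the "Percent Identity" and "Matching length" figures reported for an ungapped alignment segment can be recomputed directly from the aligned residues. The sketch below assumes that residues 1 to 99 of ACY0_HUMAN_V1 are as shown in the alignments above, and compares residues 2 to 73 of T99080_PEA_4_P8 with residues 28 to 99 of the known protein. The function and variable names are assumptions made for this example, and the calculation is a plain position-by-position comparison rather than the alignment program used to generate the reports.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues between two equal-length aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Gap characters ("-") never count as matches.
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Residues 1-99 of ACY0_HUMAN_V1, as shown in the alignments above.
KNOWN_1_99 = (
    "MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQG"
    "QLQGPISKVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK"
)

# Residues 2-73 of variant T99080_PEA_4_P8, as shown in the alignment.
VARIANT_2_73 = (
    "QAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHID"
    "KANFNNEKVILKLDYSDFQIVK"
)

# 1-based inclusive residues 28-99 correspond to the Python slice [27:99].
known_28_99 = KNOWN_1_99[27:99]
identity = percent_identity(VARIANT_2_73, known_28_99)
print(len(VARIANT_2_73), identity)
```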



DESCRIPTION FOR CLUSTER T08446
Cluster T08446 features 2 transcript(s) and 36 segment(s) of interest, the names for which
are given in Tables 785 and 786, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 787.

Table 785 - Transcripts of interest

Transcript Name       Sequence ID No.
T08446_PEA_1_T2       97
T08446_PEA_1_T22      98


Table 786 - Segments of interest

Segment Name              Sequence ID No.
T08446_PEA_1_node_2       706
T08446_PEA_1_node_9       707
T08446_PEA_1_node_15      708
T08446_PEA_1_node_17      709
T08446_PEA_1_node_25      710
T08446_PEA_1_node_29      711
T08446_PEA_1_node_38      712
T08446_PEA_1_node_43      713
T08446_PEA_1_node_51      714
T08446_PEA_1_node_52      715
T08446_PEA_1_node_55      716
T08446_PEA_1_node_57      717
T08446_PEA_1_node_59      718
T08446_PEA_1_node_62      719
T08446_PEA_1_node_63      720
T08446_PEA_1_node_3       721
T08446_PEA_1_node_5       722
T08446_PEA_1_node_7       723
T08446_PEA_1_node_12      724
T08446_PEA_1_node_13      725
T08446_PEA_1_node_19      726
T08446_PEA_1_node_21      727
T08446_PEA_1_node_23      728
T08446_PEA_1_node_27      729
T08446_PEA_1_node_32      730
T08446_PEA_1_node_34      731
T08446_PEA_1_node_45      732
T08446_PEA_1_node_46      733
T08446_PEA_1_node_48      734
T08446_PEA_1_node_54      735
T08446_PEA_1_node_58      736
T08446_PEA_1_node_60      737
T08446_PEA_1_node_61      738
T08446_PEA_1_node_64      739
T08446_PEA_1_node_65      740
T08446_PEA_1_node_66      741



Table 787 - Proteins of interest

Protein Name          Sequence ID No.    Corresponding Transcript(s)
T08446_PEA_1_P18      1370               T08446_PEA_1_T2
T08446_PEA_1_P19      1371               T08446_PEA_1_T22



These sequences are variants of the known protein Sorting nexin 26 (SwissProt accession
identifier SNXQ_HUMAN), SEQ ID NO: 1442, referred to herein as the previously known
protein.

Protein Sorting nexin 26 is known or believed to have the following function(s): May be
involved in several stages of intracellular trafficking (By similarity). The sequence for protein
Sorting nexin 26 is given at the end of the application, as "Sorting nexin 26 amino acid
sequence".

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: intracellular protein traffic, which are annotation(s) related to
Biological Process; and protein transporter, which are annotation(s) related to Molecular
Function.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

As noted above, cluster T08446 features 2 transcript(s), which were listed in Table 1 
above. These transcript(s) encode for protein(s) which are variant(s) of protein Sorting nexin 26. 
A description of each variant protein according to the present invention is now provided. 

Variant protein T08446_PEA_1_P18 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T08446_PEA_1_T2. An alignment is given to the known protein (Sorting nexin 26) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between T08446_PEA_1_P18 and SNXQ_HUMAN:
1. An isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first
amino acid sequence being at least 90% homologous to
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG

PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 

DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 

GPVLTWME corresponding to amino acids 1-185 of SNXQ_HUMAN, which also

corresponds to amino acids 1-185 of T08446_PEA_1_P18, and a second amino acid sequence 

being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least 

90% and most preferably at least 95% homologous to a polypeptide having the sequence 

LDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSW 

WRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 

GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVV 

DGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLY 

GKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNL 

AIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPA 

GRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEK 

QRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLS 

SQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGL 

GALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAP 

PAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTP 

ALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLR 

PGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEP 

LGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQ 

LRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQP 

SSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGM 

LGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPA 

SSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 

GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC corresponding to

amino acids 186 - 1305 of T08446_PEA_1_P18, wherein said first amino acid sequence and

second amino acid sequence are contiguous and in a sequential order. 

2. An isolated polypeptide encoding for a tail of T08446_PEA_1_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

LDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSW 

WRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 

GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVV 

DGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLY 

GKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNL 

AIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPA 

GRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEK 

QRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLS 

SQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGL 

GALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAP 

PAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTP 

ALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLR 

PGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEP

LGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQ 

LRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQP
SSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGM 
LGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPA 
SSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 
GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC in 

T08446_PEA_1_P18. 

Comparison report between T08446_PEA_1_P18 and Q9NT23 (SEQ ID NO: 1443):
1. An isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more 
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having 
the sequence 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP
LLTYQLYGKFSEAMSVPGEEERLVRV corresponding to amino acids 1 - 443 of
T08446_PEA_1_P18, a second amino acid sequence being at least 90% homologous to

HDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVG
MGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGSCPSTR
LLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALGRG
PSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHS
SSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDG
DELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRG
PTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHLIPLLLRGA
EAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALA
LAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAW
VPGPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSP
CSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDR
GENLYYEIGASEGSPYSG corresponding to amino acids 1 - 674 of Q9NT23, which also
corresponds to amino acids 444 - 1117 of T08446_PEA_1_P18, a bridging amino acid P
corresponding to amino acid 1118 of T08446_PEA_1_P18, and a third amino acid sequence
being at least 90% homologous to
TRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPAR
RPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHR
VPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHS
EGQTRSYC corresponding to amino acids 676 - 862 of Q9NT23, which also corresponds to
amino acids 1119 - 1305 of T08446_PEA_1_P18, wherein said first amino acid sequence,
second amino acid sequence, bridging amino acid and third amino acid sequence are contiguous
and in a sequential order.
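
By way of non-limiting illustration only, the coordinate bookkeeping of the chimeric polypeptide described in this comparison report can be verified arithmetically: the first sequence, second sequence, bridging amino acid and third sequence should tile positions 1 to 1305 of T08446_PEA_1_P18 without gaps or overlaps, and each stretch shared with Q9NT23 should have the same length in both coordinate systems. The names used below are assumptions made for this sketch; all coordinates are 1-based and inclusive, as in the text.

```python
def span_length(first, last):
    """Number of residues in a 1-based inclusive span."""
    return last - first + 1


def is_contiguous(parts, total_length):
    """True if the spans tile 1..total_length in order, with no gap or overlap."""
    expected = 1
    for _, first, last in parts:
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected == total_length + 1


# Spans on T08446_PEA_1_P18 as stated in the comparison report with Q9NT23.
P18_PARTS = [
    ("first amino acid sequence", 1, 443),
    ("second amino acid sequence", 444, 1117),
    ("bridging amino acid", 1118, 1118),
    ("third amino acid sequence", 1119, 1305),
]

# Corresponding spans on Q9NT23 for the second and third sequences.
Q9NT23_SECOND = (1, 674)
Q9NT23_THIRD = (676, 862)

tiles = is_contiguous(P18_PARTS, 1305)
total = sum(span_length(first, last) for _, first, last in P18_PARTS)
second_matches = span_length(*Q9NT23_SECOND) == span_length(444, 1117)
third_matches = span_length(*Q9NT23_THIRD) == span_length(1119, 1305)
print(tiles, total, second_matches, third_matches)
```

The single residue skipped on Q9NT23 (position 675) is the position occupied by the bridging amino acid at position 1118 of the variant.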

2. An isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG

PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 

DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 

GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM 

PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV 

PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE 

FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP

LLTYQLYGKFSEAMSVPGEEERLVRV of T08446_PEA_1_P18.

Comparison report between T08446_PEA_1_P18 and Q96CP3 (SEQ ID NO: 1444): 
1. An isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more 
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having 
the sequence 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 

PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 

DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC 

GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM 

PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV 

PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE 

FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP 

LLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANT 

SMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTF 

TSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAER 

RKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRS

AKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSS 

SESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDP 

APPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGA 

PASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLP 

PPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS 

LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRG
corresponding to amino acids 1 - 1010 of T08446_PEA_1_P18, and a second amino acid
sequence being at least 90% homologous to

LRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPG
LYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPP
DRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNL
ALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPL
LLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC
corresponding to amino acids 1 - 295 of Q96CP3, which also corresponds to amino acids 1011 -
1305 of T08446_PEA_1_P18, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQMLVPLLLQYLETLSGLVDSNLNC
GPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIVSVIDM
PPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAV
PRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSE
FIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVSSLCKLYFRELPNP
LLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLLRHLARMARHSANT
SMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSWVEFLLTHVDVLFSDTF
TSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAER
RKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRS
AKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSS
SESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTPGDP
APPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGA
PASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLP
PPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS
LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPMGTSRRG of
T08446_PEA_1_P18.

Comparison report between T08446_PEA_1_P18 and BAC86902 (SEQ ID NO: 1445):
1. An isolated chimeric polypeptide encoding for T08446_PEA_1_P18, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more 
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having 
the sequence 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG 

PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY 

DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ corresponding to amino acids 1-154 

of T08446_PEA_1_P18, a second amino acid sequence being at least 90% homologous to

MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVI 

KRYTAQAPDELSFEVGDIVSVIDMPPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPG 

LKADADGPPCGIPAPQGISSLTSAVPRPRGKLAGLLRTFMRSRPSRQRLRQRGILRQRVF 

GCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPEL 

SGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLP 

PPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGAAAFR 

EVRVQSVWEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQ 

ARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALGRGPSVPRKKP

LPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVG 

PAPAGSCESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPR 

CLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAA 

LDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQ 

QEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVA 

EQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPR 

QQSDGSLLRSQRPMGTSRRGLRGPA corresponding to amino acids 1-861 of BAC86902, 

which also corresponds to amino acids 155 - 1015 of T08446_PEA_1_P18, a third amino acid 

sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at 

least 90% and most preferably at least 95% homologous to a polypeptide having the sequence 

QVSAQLRAGGGGRDAPEAAAQSPCSVPS corresponding to amino acids 1016 - 1043 of
T08446_PEA_1_P18, a fourth amino acid sequence being at least 90% homologous to
QVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLY
YEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPP
DHLGYS corresponding to amino acids 862 - 989 of BAC86902, which also corresponds to
amino acids 1044 - 1171 of T08446_PEA_1_P18, and a fifth amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
APQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAP
WGPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYP
TPSWSLHSEGQTRSYC corresponding to amino acids 1172 - 1305 of T08446_PEA_1_P18,
wherein said first amino acid sequence, second amino acid sequence, third amino acid sequence,
fourth amino acid sequence and fifth amino acid sequence are contiguous and in a sequential
order.
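
The coordinates in the comparison report above can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: it takes the five amino acid ranges stated for T08446_PEA_1_P18 and the two matching ranges stated for BAC86902, and verifies that the ranges are contiguous and that each homologous pair covers the same number of residues. The names `SEGMENTS`, `span_length`, `is_contiguous` and `lengths_agree` are hypothetical helpers introduced only for illustration.

```python
# Ranges are 1-based and inclusive, copied from the comparison report.
# Each entry: (label, range on T08446_PEA_1_P18, range on BAC86902 or None).
SEGMENTS = [
    ("first", (1, 154), None),
    ("second", (155, 1015), (1, 861)),
    ("third", (1016, 1043), None),
    ("fourth", (1044, 1171), (862, 989)),
    ("fifth", (1172, 1305), None),
]

# Edge portion sequence quoted in the report for the third segment.
EDGE = "QVSAQLRAGGGGRDAPEAAAQSPCSVPS"


def span_length(span):
    """Number of residues in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def is_contiguous(segments):
    """True if each variant range starts right after the previous one ends."""
    spans = [variant for _, variant, _ in segments]
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(spans, spans[1:]))


def lengths_agree(segments):
    """True if every homologous pair of ranges has the same length."""
    return all(
        span_length(variant) == span_length(known)
        for _, variant, known in segments
        if known is not None
    )


if __name__ == "__main__":
    print("contiguous:", is_contiguous(SEGMENTS))
    print("paired lengths agree:", lengths_agree(SEGMENTS))
    print("edge length:", len(EDGE), "range length:", span_length((1016, 1043)))
```

Under these stated ranges, the edge portion sequence has the same length as the 1016 - 1043 range, which is a simple consistency check on the transcription of the coordinates.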

2. An isolated polypeptide encoding for a head of T08446_PEA_1_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRG
PFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSY
DDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ of T08446_PEA_1_P18.

3. An isolated polypeptide encoding for an edge portion of T08446_PEA_1_P18,
comprising an amino acid sequence being at least 70%, optionally at least about 80%, preferably
at least about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence encoding for QVSAQLRAGGGGRDAPEAAAQSPCSVPS,
corresponding to T08446_PEA_1_P18.

4. An isolated polypeptide encoding for a tail of T08446_PEA_1_P18, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence

APQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAP
WGPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYP
TPSWSLHSEGQTRSYC in T08446_PEA_1_P18.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein T08446_PEA_1_P18 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 788 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T08446_PEA_1_P18
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 788 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
714 | S->C | Yes
1000 | S->N | No
1273 | R->S | No
1274 | N->H | No



Variant protein T08446_PEA_1_P18 is encoded by the following transcript(s):
T08446_PEA_1_T2, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T08446_PEA_1_T2 is shown in bold; this coding portion starts at
position 228 and ends at position 4142. The transcript also has the following SNPs as listed in
Table 789 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T08446_PEA_1_P18 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 789 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
212 | G->A | Yes
431 | C->T | Yes
809 | C->T | Yes
1547 | G->A | Yes
2368 | C->G | Yes
3226 | G->A | No
3284 | C->G | Yes
3377 | C->T | Yes
4046 | A->C | No
4047 | A->C | No
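
The relation between the nucleotide positions in Table 789 and the amino acid positions in Table 788 can be illustrated with a short calculation. The sketch below is an illustration only and assumes that the stated coding portion (positions 228 to 4142 of T08446_PEA_1_T2) is read in frame from its first nucleotide; the function name `snp_to_residue` is hypothetical and does not come from the application.

```python
def snp_to_residue(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to a 1-based amino acid position.

    Returns None when the position lies outside the coding portion.
    Assumes the coding portion is in frame starting at cds_start.
    """
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Coding portion of transcript T08446_PEA_1_T2 as stated in the text.
CDS_START, CDS_END = 228, 4142

# Number of codons spanned by the stated coding portion.
CODON_COUNT = (CDS_END - CDS_START + 1) // 3

if __name__ == "__main__":
    print("codons in coding portion:", CODON_COUNT)
    for pos in (212, 2368, 3226, 4046, 4047):
        print(pos, "->", snp_to_residue(pos, CDS_START, CDS_END))
```

Under this assumption, nucleotide positions 2368, 3226, 4046 and 4047 fall in codons 714, 1000, 1273 and 1274, which are the amino acid positions listed in Table 788, while position 212 lies upstream of the coding portion. The coding portion spans 1305 codons, matching the 1305 amino acids given for T08446_PEA_1_P18 in the comparison reports.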




Variant protein T08446_PEA_1_P19 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T08446_PEA_1_T22. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein T08446_PEA_1_P19 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 790 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T08446_PEA_1_P19
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 790 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
194 | D->G | Yes



Variant protein T08446_PEA_1_P19 is encoded by the following transcript(s):
T08446_PEA_1_T22, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T08446_PEA_1_T22 is shown in bold; this coding portion starts at
position 228 and ends at position 965. The transcript also has the following SNPs as listed in
Table 791 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T08446_PEA_1_P19 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 791 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
212 | G->A | Yes
431 | C->T | Yes
808 | A->G | Yes



above and for which the sequence(s) are given at the end of the application. These segment(s) 
are portions of nucleic acid sequence(s) which are described herein separately because they are 
of particular interest. A description of each segment according to the present invention is now 
provided. 

Segment cluster T08446_PEA_1_node_2 according to the present invention is supported
by 1 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 792 below describes the starting and ending position of this segment on each transcript.

Table 792 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1 | 287
T08446_PEA_1_T22 | 1 | 287

Segment cluster T08446_PEA_1_node_9 according to the present invention is supported
by 17 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 793 below describes the starting and ending position of this segment on each transcript.

Table 793 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 552 | 689
T08446_PEA_1_T22 | 552 | 689



Segment cluster T08446_PEA_1_node_15 according to the present invention is supported
by 0 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T22. Table 794 below describes the
starting and ending position of this segment on each transcript.

Table 794 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T22 | 829 | 968



WO 2006/131783 



PCT/IB2005/004037 




Segment cluster T08446_PEA_1_node_17 according to the present invention is supported
by 22 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 794 below describes the
starting and ending position of this segment on each transcript.

Table 794 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 783 | 905



Segment cluster T08446_PEA_1_node_25 according to the present invention is supported
by 24 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 12 below describes the
starting and ending position of this segment on each transcript.

Table 12 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1111 | 1263

Segment cluster T08446_PEA_1_node_29 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 795 below describes the
starting and ending position of this segment on each transcript.

Table 795 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1367 | 1511

Segment cluster T08446_PEA_1_node_38 according to the present invention is supported
by 20 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 796 below describes the
starting and ending position of this segment on each transcript.

Table 796 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1703 | 1848



Segment cluster T08446_PEA_1_node_43 according to the present invention is supported
by 15 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 797 below describes the
starting and ending position of this segment on each transcript.

Table 797 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1849 | 2002

Segment cluster T08446_PEA_1_node_51 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 798 below describes the
starting and ending position of this segment on each transcript.

Table 798 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2224 | 2571



Segment cluster T08446_PEA_1_node_52 according to the present invention is supported
by 15 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 799 below describes the
starting and ending position of this segment on each transcript.

Table 799 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2572 | 2694



Segment cluster T08446_PEA_1_node_55 according to the present invention is supported
by 21 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 800 below describes the
starting and ending position of this segment on each transcript.

Table 800 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2707 | 2883



Segment cluster T08446_PEA_1_node_57 according to the present invention is supported
by 37 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 801 below describes the
starting and ending position of this segment on each transcript.

Table 801 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2884 | 3275



Segment cluster T08446_PEA_1_node_59 according to the present invention is supported
by 36 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 802 below describes the
starting and ending position of this segment on each transcript.

Table 802 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3360 | 3670



Segment cluster T08446_PEA_1_node_62 according to the present invention is supported
by 36 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 803 below describes the
starting and ending position of this segment on each transcript.

Table 803 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3783 | 3988

Segment cluster T08446_PEA_1_node_63 according to the present invention is supported
by 64 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 804 below describes the
starting and ending position of this segment on each transcript.

Table 804 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3989 | 4414



the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 

Segment cluster T08446_PEA_1_node_3 according to the present invention is supported
by 14 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 805 below describes the starting and ending position of this segment on each transcript.

Table 805 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 288 | 385
T08446_PEA_1_T22 | 288 | 385



Segment cluster T08446_PEA_1_node_5 according to the present invention is supported
by 17 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 806 below describes the starting and ending position of this segment on each transcript.

Table 806 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 386 | 470
T08446_PEA_1_T22 | 386 | 470

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 807.

Table 807 - Oligonucleotides related to this segment

Oligonucleotide name | Overexpressed in cancers | Chip reference
T08446_0_9_0 | lung malignant tumors | LUN



Segment cluster T08446_PEA_1_node_7 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 808 below describes the starting and ending position of this segment on each transcript.

Table 808 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 471 | 551
T08446_PEA_1_T22 | 471 | 551

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 809.

Table 809 - Oligonucleotides related to this segment

Oligonucleotide name | Overexpressed in cancers | Chip reference
T08446_0_9_0 | lung malignant tumors | LUN

Segment cluster T08446_PEA_1_node_12 according to the present invention is supported
by 14 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2 and T08446_PEA_1_T22.
Table 810 below describes the starting and ending position of this segment on each transcript.

Table 810 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 690 | 782
T08446_PEA_1_T22 | 690 | 782



Segment cluster T08446_PEA_1_node_13 according to the present invention is supported
by 0 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T22. Table 811 below describes the
starting and ending position of this segment on each transcript.

Table 811 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T22 | 783 | 828



Segment cluster T08446_PEA_1_node_19 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 812 below describes the
starting and ending position of this segment on each transcript.

Table 812 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 906 | 983

Segment cluster T08446_PEA_1_node_21 according to the present invention is supported
by 21 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 813 below describes the
starting and ending position of this segment on each transcript.

Table 813 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 984 | 1050



Segment cluster T08446_PEA_1_node_23 according to the present invention is supported
by 22 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 814 below describes the
starting and ending position of this segment on each transcript.

Table 814 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1051 | 1110

Segment cluster T08446_PEA_1_node_27 according to the present invention is supported
by 23 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 815 below describes the
starting and ending position of this segment on each transcript.

Table 815 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1264 | 1366









Segment cluster T08446_PEA_1_node_32 according to the present invention is supported
by 23 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 816 below describes the
starting and ending position of this segment on each transcript.

Table 816 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1512 | 1594



Segment cluster T08446_PEA_1_node_34 according to the present invention is supported
by 22 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 817 below describes the
starting and ending position of this segment on each transcript.

Table 817 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 1595 | 1702

Segment cluster T08446_PEA_1_node_45 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 818 below describes the
starting and ending position of this segment on each transcript.

Table 818 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2003 | 2091



Segment cluster T08446_PEA_1_node_46 according to the present invention is supported
by 18 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 819 below describes the
starting and ending position of this segment on each transcript.

Table 819 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2092 | 2148



Segment cluster T08446_PEA_1_node_48 according to the present invention is supported
by 19 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 820 below describes the
starting and ending position of this segment on each transcript.

Table 820 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2149 | 2223

Segment cluster T08446_PEA_1_node_54 according to the present invention can be
found in the following transcript(s): T08446_PEA_1_T2. Table 821 below describes the starting
and ending position of this segment on each transcript.

Table 821 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 2695 | 2706



Segment cluster T08446_PEA_1_node_58 according to the present invention is supported
by 13 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 822 below describes the
starting and ending position of this segment on each transcript.

Table 822 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3276 | 3359



Segment cluster T08446_PEA_1_node_60 according to the present invention is supported
by 27 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 823 below describes the
starting and ending position of this segment on each transcript.

Table 823 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3671 | 3720

Segment cluster T08446_PEA_1_node_61 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 824 below describes the
starting and ending position of this segment on each transcript.

Table 824 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 3721 | 3782



Segment cluster T08446_PEA_1_node_64 according to the present invention can be
found in the following transcript(s): T08446_PEA_1_T2. Table 825 below describes the starting
and ending position of this segment on each transcript.

Table 825 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 4415 | 4420



Segment cluster T08446_PEA_1_node_65 according to the present invention is supported
by 39 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T08446_PEA_1_T2. Table 826 below describes the
starting and ending position of this segment on each transcript.

Table 826 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T08446_PEA_1_T2 | 4421 | 4472



Segment cluster T08446_PEA_l_node 66 according to the present invention is supported 
by 29 libraries. The number of libraries was determined as previously described. This segment 
can be found in the following transcript(s): TO 8446_PE A_ 1 _T2 . Table 827 below describes the 
starting and ending position of this segment on each transcript. 
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Table 827 - Segment location on transcripts 



Transcript name      Segment starting position      Segment ending position
T08446_PEA_1_T2      4473                           4539
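
The segment coordinates reported in Tables 824 to 827 can be checked with a short script. The following is a minimal sketch, assuming the positions are 1-based and inclusive (the text does not state the convention explicitly); the helper names are hypothetical and are not part of the application.

```python
# Segment coordinates on transcript T08446_PEA_1_T2, copied from Tables 824-827.
# Assumption of this sketch: positions are 1-based and inclusive.
segments = {
    "node_61": (3721, 3782),
    "node_64": (4415, 4420),
    "node_65": (4421, 4472),
    "node_66": (4473, 4539),
}


def segment_length(start, end):
    """Number of nucleotides covered by an inclusive [start, end] range."""
    return end - start + 1


def adjacent_pairs(segs):
    """Return pairs of segment names where one ends right before the next starts."""
    ordered = sorted(segs.items(), key=lambda item: item[1][0])
    pairs = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b == end_a + 1:
            pairs.append((name_a, name_b))
    return pairs


lengths = {name: segment_length(s, e) for name, (s, e) in segments.items()}
contiguous = adjacent_pairs(segments)
```

Under that assumption, nodes 64, 65 and 66 tile the transcript without gaps, while node 61 is separated from node 64 by other segments.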






Variant protein alignment to the previously known protein:

Sequence name: SNXQ_HUMAN

Sequence documentation:

Alignment of: T08446_PEA_1_P18 x SNXQ_HUMAN

Alignment segment 1/1:

Quality: 1835.00
Escore: 0
Matching length: 185
Total length: 185
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKP 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKP 50

  51 GKRLSAPRGPFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELV 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 GKRLSAPRGPFPRLADCAHFHYENVDFGHIQLLLSPDREGPSLSGENELV 100

 101 FGVQVTCQGRSWPVLRSYDDFRSLDAHLHRCIFDRRFSCLPELPPPPEGA 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 FGVQVTCQGRSWPVLRSYDDFRSLDAHLHRCIFDRRFSCLPELPPPPEGA 150

 151 RAAQMLVPLLLQYLETLSGLVDSNLNCGPVLTWME 185
     |||||||||||||||||||||||||||||||||||
 151 RAAQMLVPLLLQYLETLSGLVDSNLNCGPVLTWME 185



Sequence name: Q9NT23

Sequence documentation:

Alignment of: T08446_PEA_1_P18 x Q9NT23

Alignment segment 1/1:

Quality: 8548.00
Escore: 0
Matching length: 862
Total length: 862
Matching Percent Similarity: 99.88
Matching Percent Identity: 99.88
Total Percent Similarity: 99.88
Total Percent Identity: 99.88
Gaps: 0

Alignment:



 444 HDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRS 493
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 HDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRS 50

 494 MELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRC 543
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 MELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRC 100

 544 LLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERR 593
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 LLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERR 150

 594 KGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSR 643
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 KGERGEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSR 200

 644 PDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSC 693
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 PDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSC 250

 694 ESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELD 743
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 ESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELD 300

 744 FSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVT 793
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 FSPPRCLEGLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVT 350

 794 PQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPA 843
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 PQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPA 400

 844 LSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPL 893
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 LSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPL 450

 894 PPPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPP 943
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 PPPPLSLLRPGGAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPP 500

 944 ASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQS 993
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 ASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQS 550

 994 DGSLLRSQRPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPS 1043
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 551 DGSLLRSQRPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPS 600

1044 QVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGP 1093
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 601 QVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGP 650

1094 PAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQS 1143
     |||||||||||||||||||||||| |||||||||||||||||||||||||
 651 PAPLDRGENLYYEIGASEGSPYSGLTRSWSPFRSMPPDRLNASYGMLGQS 700

1144 PPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALG 1193
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 701 PPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALG 750

1194 PRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGP 1243
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 751 PRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGP 800

1244 WGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSW 1293
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 801 WGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSW 850

1294 SLHSEGQTRSYC 1305
     ||||||||||||
 851 SLHSEGQTRSYC 862






Sequence name: Q96CP3

Sequence documentation:

Alignment of: T08446_PEA_1_P18 x Q96CP3

Alignment segment 1/1:

Quality: 3019.00
Escore: 0
Matching length: 295
Total length: 295
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

1011 LRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLP 1060
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 LRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLP 50

1061 PFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYEIGAS 1110
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 PFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYEIGAS 100

1111 EGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAP 1160
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 EGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAP 150

1161 SCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAH 1210
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 SCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAH 200

1211 PRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 1260
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 PRSRSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAY 250

1261 GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC 1305
     |||||||||||||||||||||||||||||||||||||||||||||
 251 GRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC 295



Sequence name: BAC86902

Sequence documentation:

Alignment of: T08446_PEA_1_P18 x BAC86902

Alignment segment 1/1:

Quality: 9651.00
Escore: 0
Matching length: 991
Total length: 1019
Matching Percent Similarity: 99.90
Matching Percent Identity: 99.90
Total Percent Similarity: 97.15
Total Percent Identity: 97.15
Gaps: 1






Alignment:

 155 MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIP 204
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIP 50

 205 AVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSWWRGKRGFQVG 254
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 AVAAAHVIKRYTAQAPDELSFEVGDIVSVIDMPPTEDRSWWRGKRGFQVG 100

 255 FFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 304
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 FFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLA 150

 305 GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCS 354
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 GLLRTFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCS 200

 355 EFIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVS 404
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 EFIEAHGVVDGIYRLSGVSSNIQRLRHEFDSERIPELSGPAFLQDIHSVS 250

 405 SLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPH 454
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 SLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPH 300

 455 YRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGA 504
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 YRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGA 350

 505 AAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGS 554
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 AAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGS 400

 555 CPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKP 604
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 CPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKP 450

 605 GGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKS 654
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 GGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKS 500

 655 EESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSES 704
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 EESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSCESLSSSSSSES 550

 705 SSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLR 754
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 551 SSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLR 600

 755 GLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRGPTS 804
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 601 GLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRGPTS 650

 805 PASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHL 854
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 651 PASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPGRSLRPHL 700

 855 IPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPG 904
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 701 IPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPG 750






 905 GAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS 954
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 751 GAPPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLS 800

 955 LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPM 1004
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 801 LEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQRPM 850

1005 GTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPA 1054
     |||||||||||                            |||||||||||
 851 GTSRRGLRGPA............................QVPTPGFFSPA 872

1055 PRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLY 1104
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 873 PRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLY 922

1105 YEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLL 1154
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 923 YEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLL 972

1155 SYPPAPSCFPPDHLGYSAP 1173
     ||||||||||||||||| |
 973 SYPPAPSCFPPDHLGYSPP 991
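
The two identity figures reported for the BAC86902 alignment differ only in their denominator: the "matching" percentage is taken over the aligned residues, while the "total" percentage also counts the gapped positions. The sketch below reproduces both values. It assumes exactly one mismatch among the 991 aligned positions, which is inferred from the reported 99.90% and is not stated in the text; the function name is hypothetical.

```python
def percent(matches, length):
    """Percentage of matching positions, rounded to two decimals."""
    return round(100.0 * matches / length, 2)


matching_length = 991   # aligned residue pairs, gap excluded (from the header)
total_length = 1019     # aligned positions including the 28-residue gap
identical = 990         # assumption: a single mismatch among the aligned pairs

matching_identity = percent(identical, matching_length)
total_identity = percent(identical, total_length)
gap_size = total_length - matching_length
```

With these inputs the sketch gives 99.9 and 97.15, in line with the reported header values, and a gap of 28 positions.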

DESCRIPTION FOR CLUSTER HUMCA1XIA
Cluster HUMCA1XIA features 4 transcript(s) and 46 segment(s) of interest, the names for
which are given in Tables 828 and 829, respectively; the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 830.

Table 828 - Transcripts of interest 

Transcript Name      Sequence ID No.
HUMCA1XIA_T16        99
HUMCA1XIA_T17        100
HUMCA1XIA_T19        101
HUMCA1XIA_T20        102



Table 829 - Segments of interest 





Segment Name            Sequence ID No.
HUMCA1XIA_node_0        742
HUMCA1XIA_node_2        743
HUMCA1XIA_node_4        744
HUMCA1XIA_node_6        745
HUMCA1XIA_node_8        746
HUMCA1XIA_node_9        747
[name illegible]        748
HUMCA1XIA_node_54       749
HUMCA1XIA_node_55       750
HUMCA1XIA_node_92       751
HUMCA1XIA_node_11       752
HUMCA1XIA_node_15       753
HUMCA1XIA_node_19       754
HUMCA1XIA_node_21       755
HUMCA1XIA_node_23       756
HUMCA1XIA_node_25       757
HUMCA1XIA_node_27       758
HUMCA1XIA_node_29       759
HUMCA1XIA_node_31       760
HUMCA1XIA_node_33       761
HUMCA1XIA_node_35       762
HUMCA1XIA_node_37       763
HUMCA1XIA_node_39       764
HUMCA1XIA_node_41       765
HUMCA1XIA_node_43       766
HUMCA1XIA_node_45       767
HUMCA1XIA_node_47       768
HUMCA1XIA_node_49       769
HUMCA1XIA_node_51       770
HUMCA1XIA_node_57       771
HUMCA1XIA_node_59       772
HUMCA1XIA_node_62       773
HUMCA1XIA_node_64       774
HUMCA1XIA_node_66       775
HUMCA1XIA_node_68       776
HUMCA1XIA_node_70       777
HUMCA1XIA_node_72       778
HUMCA1XIA_node_74       779
HUMCA1XIA_node_76       780
HUMCA1XIA_node_78       782
HUMCA1XIA_node_81       783
HUMCA1XIA_node_83       784
HUMCA1XIA_node_85       785
HUMCA1XIA_node_87       786
HUMCA1XIA_node_89       787
HUMCA1XIA_node_91       788



Table 830 - Proteins of interest 



Protein Name       Sequence ID No.    Corresponding Transcript(s)
HUMCA1XIA_P14      1372               HUMCA1XIA_T16
HUMCA1XIA_P15      1373               HUMCA1XIA_T17
HUMCA1XIA_P16      1374               HUMCA1XIA_T19
HUMCA1XIA_P17      1375               HUMCA1XIA_T20



These sequences are variants of the known protein Collagen alpha 1 (SwissProt accession
identifier CA1B_HUMAN), SEQ ID NO: 1446, referred to herein as the previously known
protein.

Protein Collagen alpha 1 is known or believed to have the following function(s): May
play an important role in fibrillogenesis by controlling lateral growth of collagen II fibrils. The
sequence for protein Collagen alpha 1 is given at the end of the application, as "Collagen alpha
1 amino acid sequence". Known polymorphisms for this sequence are as shown in Table 831.

Table 831 - Amino acid mutations for Known Protein 



SNP position(s) on amino acid sequence    Comment
625            G -> V (in STL2). /FTId=VAR_013583.
676            G -> R (in STL2; overlapping phenotype with Marshall syndrome). /FTId=VAR_013584.
921 - 926      Missing (in STL2; overlapping phenotype with Marshall syndrome). /FTId=VAR_013585.
1313 - 1315    Missing (in STL2; overlapping phenotype with Marshall syndrome). /FTId=VAR_013586.
1516           G -> V (in STL2; overlapping phenotype with Marshall syndrome). /FTId=VAR_013587.
941 - 944      KDGL -> RMGC
986            Y -> H
1074           R -> P
1142           G -> D
1218           M -> W
1758           T -> A
1786           S -> N

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: cartilage condensation; vision; hearing; cell-cell adhesion;
extracellular matrix organization and biogenesis, which are annotation(s) related to Biological
Process; extracellular matrix structural protein; extracellular matrix protein, adhesive, which are
annotation(s) related to Molecular Function; and extracellular matrix; collagen; collagen type
XI, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or LocusLink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HUMCA1XIA can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 32 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
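
The "parts per million" figure described above is a simple ratio scaled by one million. The following is a minimal sketch of that calculation; the weighting applied to the EST counts is not specified in this section, so the sketch uses raw counts, and the function name and the example counts are hypothetical.

```python
def parts_per_million(cluster_est_count, category_est_count):
    """Cluster ESTs per million ESTs observed in one tissue category.

    Assumption: unweighted counts are used; the application refers to a
    weighted expression whose weights are not given here.
    """
    if category_est_count <= 0:
        raise ValueError("category_est_count must be positive")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical counts, for illustration only: 3 cluster ESTs among
# 250,000 ESTs sequenced from one tissue category.
example_ppm = parts_per_million(3, 250_000)
```

A value of 0 in the tissue table would then correspond to a category in which no EST of the cluster was observed.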

Overall, the following results were obtained as shown with regard to the histograms in
Figure 32 and Table 832. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: bone malignant tumors, epithelial malignant tumors, a
mixture of malignant tumors from different tissues and lung malignant tumors.

Table 832 - Normal tissue distribution 



Name of Tissue     Number
adrenal            0
bone               207
brain              13
colon              0
epithelial         11
general            11
head and neck      0
kidney             0
lung               0
breast             8
pancreas           0
stomach            73
uterus             9



Table 833 - P values and ratios for expression in cancerous tissue 



Name of Tissue    P1         P2         SP1        R3     SP2        R4
adrenal           4.2e-01    1.9e-01    9.6e-02    3.4    8.2e-02    3.6
bone              2.4e-01    6.3e-01    7.7e-10    4.3    5.3e-03    1.6
brain             5.0e-01    6.9e-01    1.8e-01    2.1    4.2e-01    1.3
colon             1.3e-02    2.9e-02    2.4e-01    3.0    3.5e-01    2.4
epithelial        3.9e-04    3.2e-03    1.3e-03    2.3    1.8e-02    1.7
general           5.6e-05    1.6e-03    9.5e-17    4.5    1.1e-09    2.8
head and neck     1.2e-01    2.1e-01    1          1.3    1          1.1
kidney            6.5e-01    7.2e-01    3.4e-01    2.4    4.9e-01    1.9
lung              5.3e-02    9.1e-02    5.5e-05    7.3    5.0e-03    4.0
breast            4.3e-01    5.6e-01    6.9e-01    1.4    8.2e-01    1.1
pancreas          3.3e-01    1.8e-01    4.2e-01    2.4    1.5e-01    3.7
stomach           5.0e-01    6.1e-01    6.9e-01    1.0    6.7e-01    0.8
uterus            7.1e-01    7.0e-01    6.6e-01    1.1    6.4e-01    1.1



As noted above, cluster HUMCA1XIA features 4 transcript(s), which were listed in Table
828 above. These transcript(s) encode for protein(s) which are variant(s) of protein Collagen alpha
1. A description of each variant protein according to the present invention is now provided.

Variant protein HUMCA1XIA_P14 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
HUMCA1XIA_T16. An alignment is given to the known protein (Collagen alpha 1) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMCA1XIA_P14 and CA1B_HUMAN_V5 (SEQ ID NO:
1447):

1. An isolated chimeric polypeptide encoding for HUMCA1XIA_P14, comprising a first
amino acid sequence being at least 90% homologous to
MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM
GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAG
PRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQG
PIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVK
GADGVRGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVGQIGPRGEDGPEGPKGRAGPT
GDPGPSGQAGEKGKLGVPGLPGYPGRQGPKGSTGFPGFPGANGEKGARGVAGKPGPR
GQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQGPVGFPGPKGPPGP
PGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQG
LPGAAGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQGPPGP
V corresponding to amino acids 1 - 1056 of CA1B_HUMAN_V5, which also corresponds to
amino acids 1 - 1056 of HUMCA1XIA_P14, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VSMMIINSQTIMVVNYSSSFITLML corresponding to amino acids 1057 - 1081 of
HUMCA1XIA_P14, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMCA1XIA_P14, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VSMMIINSQTIMVVNYSSSFITLML in HUMCA1XIA_P14.

It should be noted that the known protein sequence (CA1B_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for CA1B_HUMAN_V5. These changes were previously known to occur and are
listed in the table below.

Table 834 - Changes to CA1B_HUMAN_V5

SNP position(s) on amino acid sequence    Type of change
987                                       conflict



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCA1XIA_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 835 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCA1XIA_P14
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 835 - Amino acid mutations 

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
8                                         W -> G                       Yes
46                                        D -> E                       Yes
559                                       G -> S                       Yes
832                                       G -> *                       Yes
986                                       H -> Y                       Yes
1061                                      I -> M                       Yes
1070                                      V -> A                       Yes



Variant protein HUMCA1XIA_P14 is encoded by the following transcript(s):
HUMCA1XIA_T16, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCA1XIA_T16 is shown in bold; this coding portion starts at
position 319 and ends at position 3561. The transcript also has the following SNPs as listed in
Table 836 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCA1XIA_P14 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 836 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
157                                    A -> G                      No
241                                    T -> A                      Yes
340                                    T -> G                      Yes
456                                    T -> G                      Yes
1993                                   G -> A                      Yes
2812                                   G -> T                      Yes
3274                                   C -> T                      Yes
3282                                   C -> T                      Yes
3501                                   A -> G                      Yes
3527                                   T -> C                      Yes
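
The nucleotide SNP positions on transcript HUMCA1XIA_T16 can be related to the amino acid SNP positions of HUMCA1XIA_P14 through the stated coding portion (positions 319 to 3561). The following is a minimal sketch, assuming 1-based inclusive coordinates and a reading frame that starts at the first coding position; the function name is hypothetical.

```python
CDS_START, CDS_END = 319, 3561  # coding portion of HUMCA1XIA_T16, as stated in the text


def codon_index(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """1-based codon (amino acid) number for a transcript position.

    Returns None for positions outside the coding portion.
    Assumption: coordinates are 1-based and inclusive.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Number of codons spanned by the coding portion.
cds_codons = (CDS_END - CDS_START + 1) // 3

# Nucleotide SNP positions reported for HUMCA1XIA_T16.
snp_positions = [157, 241, 340, 456, 1993, 2812, 3274, 3282, 3501, 3527]
mapped = {pos: codon_index(pos) for pos in snp_positions}
```

Under these assumptions the coding portion spans 1081 codons, matching the 1081 amino acids of the variant protein, and positions 340, 456, 1993, 2812, 3274, 3501 and 3527 fall in codons 8, 46, 559, 832, 986, 1061 and 1070, the same positions listed for the non-silent amino acid changes. Positions 157 and 241 lie upstream of the coding portion, and position 3282 falls in codon 988, which has no entry among the amino acid changes.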



Variant protein HUMCA1XIA_P15 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
HUMCA1XIA_T17. An alignment is given to the known protein (Collagen alpha 1) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMCA1XIA_P15 and CA1B_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCA1XIA_P15, comprising a first
amino acid sequence being at least 90% homologous to
MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM
GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAG
PRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQG
PIGPPGEK corresponding to amino acids 1 - 714 of CA1B_HUMAN, which also corresponds
to amino acids 1 - 714 of HUMCA1XIA_P15, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
MCCNLSFGILIPLQK corresponding to amino acids 715 - 729 of HUMCA1XIA_P15, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of HUMCA1XIA_P15, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MCCNLSFGILIPLQK in HUMCA1XIA_P15.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCA1XIA_P15 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 837 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCA1XIA_P15
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 837 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
8                                         W -> G                       Yes
46                                        D -> E                       Yes
559                                       G -> S                       Yes



The glycosylation sites of variant protein HUMCA1XIA_P15, as compared to the known
protein Collagen alpha 1, are described in Table 838 (given according to their position(s) on the
amino acid sequence in the first column; the second column indicates whether the glycosylation
site is present in the variant protein; and the last column indicates whether the position is
different on the variant protein).

Table 838 - Glycosylation site(s) 



Position(s) on known amino acid sequence    Present in variant protein?
1640                                        no



Variant protein HUMCA1XIA_P15 is encoded by the following transcript(s):
HUMCA1XIA_T17, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCA1XIA_T17 is shown in bold; this coding portion starts at
position 319 and ends at position 2505. The transcript also has the following SNPs as listed in
Table 839 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCA1XIA_P15 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 839 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
157                                    A -> G                      No
241                                    T -> A                      Yes
340                                    T -> G                      Yes
456                                    T -> G                      Yes
1993                                   G -> A                      Yes
2473                                   C -> T                      Yes



Variant protein HUMCA1XIA_P16 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
HUMCA1XIA_T19. An alignment is given to the known protein (Collagen alpha 1) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMCA1XIA_P16 and CA1B_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCA1XIA_P16, comprising a first
amino acid sequence being at least 90% homologous to
MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQT
EANIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSED
TLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEEFGPGVPAETDITETSIN
GHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPPG
RPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM
GLTGRPGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP
GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEA
corresponding to amino acids 1 - 648 of CA1B_HUMAN, which also corresponds to amino
acids 1 - 648 of HUMCA1XIA_P16, a second amino acid sequence being at least 90%
homologous to GMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQGPIGPPGEK
corresponding to amino acids 667 - 714 of CA1B_HUMAN, which also corresponds to amino
acids 649 - 696 of HUMCA1XIA_P16, and a third amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VSFSFSLFYKKVIKFACDKRFVGRHDERKVVKLSLPLYLIYE corresponding to amino
acids 697 - 738 of HUMCA1XIA_P16, wherein said first amino acid sequence, second amino
acid sequence and third amino acid sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of HUMCA1XIA_P16,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise AG, having a
structure as follows: a sequence starting from any of amino acid numbers 648-x to 648; and
ending at any of amino acid numbers 649+ ((n-2) - x), in which x varies from 0 to n-2.
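The window structure of the edge portion can be made concrete with a short enumeration. The following Python sketch is illustrative only and is not part of the application: the function name `edge_windows` and its parameters are hypothetical, and it assumes 1-based, inclusive residue numbering with the junction residues A and G at positions 648 and 649 of HUMCA1XIA_P16, as stated above.

```python
def edge_windows(n, left=648, right=649):
    """Enumerate (start, end) residue ranges of length n that span the junction.

    Hypothetical helper: `left` and `right` are the two junction residues
    (A at 648 and G at 649 in HUMCA1XIA_P16, per the text).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to include both junction residues")
    windows = []
    for x in range(0, n - 1):          # x varies from 0 to n-2
        start = left - x               # start at any of 648-x ... 648
        end = right + ((n - 2) - x)    # end at 649 + ((n-2) - x)
        windows.append((start, end))
    return windows


windows = edge_windows(10)
for start, end in windows:
    # every window has length n and contains both residue 648 and residue 649
    print(start, end, end - start + 1)
```

Under these assumptions there are n-1 possible windows for a given length n, each of exactly n residues, and each includes the AG pair at the junction.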

3. An isolated polypeptide encoding for a tail of HUMCA1XIA_P16, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VSFSFSLFYKKVIKFACDKRFVGRHDERKVVKLSLPLYLIYE in
HUMCA1XIA_P16.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCA1XIA_P16 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 840, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCA1XIA_P16
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 840 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
8                                         W->G                         Yes
46                                        D->E                         Yes
559                                       G->S                         Yes



The glycosylation sites of variant protein HUMCA1XIA_P16, as compared to the known
protein Collagen alpha 1, are described in Table 841 (given according to their position(s) on the
amino acid sequence in the first column; the second column indicates whether the glycosylation
site is present in the variant protein; and the last column indicates whether the position is
different on the variant protein).

Table 841 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?
1640                                        no



Variant protein HUMCA1XIA_P16 is encoded by the following transcript(s):
HUMCA1XIA_T19, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCA1XIA_T19 is shown in bold; this coding portion starts at
position 319 and ends at position 2532. The transcript also has the following SNPs as listed in
Table 842 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCA1XIA_P16 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 842 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
157                                    A->G                        No
241                                    T->A                        Yes
340                                    T->G                        Yes
456                                    T->G                        Yes
1993                                   G->A                        Yes
2606                                   C->A                        Yes
2677                                   T->G                        Yes
2849                                   C->T                        Yes
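The nucleotide positions of Table 842 can be related to the amino acid positions of Table 840 by simple codon arithmetic. The following Python sketch is illustrative only: the function name `snp_to_residue` is hypothetical, and it assumes 1-based nucleotide numbering, a first codon beginning at position 319 of HUMCA1XIA_T19, and an uninterrupted reading frame through position 2532.

```python
CDS_START, CDS_END = 319, 2532   # coding portion of HUMCA1XIA_T19, from the text


def snp_to_residue(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Return the 1-based amino acid position covered by a nucleotide position,
    or None when the position lies outside the coding portion."""
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


table_842_positions = [157, 241, 340, 456, 1993, 2606, 2677, 2849]
mapping = {pos: snp_to_residue(pos) for pos in table_842_positions}
for pos, residue in mapping.items():
    print(pos, residue if residue is not None else "outside coding portion")
```

Under these assumptions, positions 340, 456 and 1993 fall in codons 8, 46 and 559, which are the three positions listed in Table 840, while the remaining positions fall outside the stated coding portion.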



Variant protein HUMCA1XIA_P17 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
HUMCA1XIA_T20. An alignment is given to the known protein (Collagen alpha 1) at the end
of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between HUMCA1XIA_P17 and CA1B_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCA1XIA_P17, comprising a first
amino acid sequence being at least 90 % homologous to

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTT
GFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIY
NEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLFRTVNIADGKWHRVAISVEKKTVTM
IVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEH
YSPDCDSSAPKAAQAQEPQIDE

corresponding to amino acids 1 - 260 of CA1B_HUMAN,
which also corresponds to amino acids 1 - 260 of HUMCA1XIA_P17, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
VRSTRPEKVFVFQ corresponding to amino acids 261 - 273 of HUMCA1XIA_P17, wherein
said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of HUMCA1XIA_P17, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VRSTRPEKVFVFQ in HUMCA1XIA_P17.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCA1XIA_P17 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 843, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCA1XIA_P17
sequence provides support for the deduced sequence of this variant protein according to the
present invention).



Table 843 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
8                                         W->G                         Yes
46                                        D->E                         Yes



The glycosylation sites of variant protein HUMCA1XIA_P17, as compared to the known
protein Collagen alpha 1, are described in Table 844 (given according to their position(s) on the
amino acid sequence in the first column; the second column indicates whether the glycosylation
site is present in the variant protein; and the last column indicates whether the position is
different on the variant protein).



Table 844 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?
1640                                        no



Variant protein HUMCA1XIA_P17 is encoded by the following transcript(s):
HUMCA1XIA_T20, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCA1XIA_T20 is shown in bold; this coding portion starts at
position 319 and ends at position 1137. The transcript also has the following SNPs as listed in
Table 845 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCA1XIA_P17 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 845 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
157                                    A->G                        No
241                                    T->A                        Yes
340                                    T->G                        Yes
456                                    T->G                        Yes
1150                                   A->C                        Yes
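The stated coding-portion coordinates can be cross-checked against the residue numbering used in the comparison reports. The following Python sketch is illustrative only: the function name `codon_count` is hypothetical, and it assumes 1-based, inclusive nucleotide coordinates.

```python
# Coding portions as stated in the text (start, end), 1-based inclusive.
coding_portions = {
    "HUMCA1XIA_T17": (319, 2505),
    "HUMCA1XIA_T19": (319, 2532),
    "HUMCA1XIA_T20": (319, 1137),
}


def codon_count(start, end):
    """Number of whole codons in an inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range length is not a multiple of three")
    return length // 3


codons = {name: codon_count(*span) for name, span in coding_portions.items()}
for name, count in codons.items():
    print(name, count)
```

Under these assumptions the ranges give 729, 738 and 273 codons respectively. The values for HUMCA1XIA_T19 and HUMCA1XIA_T20 equal the highest residue numbers cited above for HUMCA1XIA_P16 (738) and HUMCA1XIA_P17 (273), which suggests, without confirming it, that the stated coding portions do not count a stop codon.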


As noted above, cluster HUMCA1XIA features 46 segment(s), which were listed in Table
2 above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HUMCA1XIA_node_0 according to the present invention is supported
by 13 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17,
HUMCA1XIA_T19 and HUMCA1XIA_T20. Table 846 below describes the starting and
ending position of this segment on each transcript.

Table 846 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1                            424
HUMCA1XIA_T17      1                            424
HUMCA1XIA_T19      1                            424
HUMCA1XIA_T20      1                            424



Segment cluster HUMCA1XIA_node_2 according to the present invention is supported
by 9 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17,
HUMCA1XIA_T19 and HUMCA1XIA_T20. Table 847 below describes the starting and
ending position of this segment on each transcript.

Table 847 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      425                          592
HUMCA1XIA_T17      425                          592
HUMCA1XIA_T19      425                          592
HUMCA1XIA_T20      425                          592



Segment cluster HUMCA1XIA_node_4 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17,
HUMCA1XIA_T19 and HUMCA1XIA_T20. Table 848 below describes the starting and
ending position of this segment on each transcript.

Table 848 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      593                          806
HUMCA1XIA_T17      593                          806
HUMCA1XIA_T19      593                          806
HUMCA1XIA_T20      593                          806



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 849.

Table 849 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
HUMCA1XIA_0_18_0        lung malignant tumors       LUN



Segment cluster HUMCA1XIA_node_6 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17,
HUMCA1XIA_T19 and HUMCA1XIA_T20. Table 850 below describes the starting and
ending position of this segment on each transcript.

Table 850 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      807                          969
HUMCA1XIA_T17      807                          969
HUMCA1XIA_T19      807                          969
HUMCA1XIA_T20      807                          969



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 851.

Table 851 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
HUMCA1XIA_0_18_0        lung malignant tumors       LUN



Segment cluster HUMCA1XIA_node_8 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17,
HUMCA1XIA_T19 and HUMCA1XIA_T20. Table 852 below describes the starting and
ending position of this segment on each transcript.

Table 852 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      970                          1098
HUMCA1XIA_T17      970                          1098
HUMCA1XIA_T19      970                          1098
HUMCA1XIA_T20      970                          1098



Segment cluster HUMCA1XIA_node_9 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T20. Table 853 below describes the
starting and ending position of this segment on each transcript.

Table 853 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T20      1099                         1271
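The segment tables for transcript HUMCA1XIA_T20 listed so far (Tables 846, 847, 848, 850, 852 and 853) can be checked for contiguity. The following Python sketch is illustrative only: the function name `find_breaks` is hypothetical, and it assumes 1-based, inclusive coordinates in which consecutive segments abut when one starts at the position after the previous one ends.

```python
# (node name, start, end) on HUMCA1XIA_T20, copied from the tables above.
t20_segments = [
    ("node_0", 1, 424),
    ("node_2", 425, 592),
    ("node_4", 593, 806),
    ("node_6", 807, 969),
    ("node_8", 970, 1098),
    ("node_9", 1099, 1271),
]


def find_breaks(segments):
    """Return (previous node, next node, gap) for every pair of neighbouring
    segments that do not abut; a negative gap indicates an overlap."""
    ordered = sorted(segments, key=lambda seg: seg[1])
    breaks = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if start_b != end_a + 1:
            breaks.append((name_a, name_b, start_b - end_a - 1))
    return breaks


print(find_breaks(t20_segments))
```

Under these assumptions the six listed segments tile positions 1 to 1271 of HUMCA1XIA_T20 without gaps or overlaps, so the check returns an empty list.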



Segment cluster HUMCA1XIA_node_18 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 854 below describes the starting and ending position of this segment
on each transcript.

Table 854 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1309                         1522
HUMCA1XIA_T17      1309                         1522
HUMCA1XIA_T19      1309                         1522



Segment cluster HUMCA1XIA_node_54 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T19. Table 855 below describes the
starting and ending position of this segment on each transcript.

Table 855 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T19      2407                         2836



Segment cluster HUMCA1XIA_node_55 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T17 and HUMCA1XIA_T19. Table
856 below describes the starting and ending position of this segment on each transcript.

Table 856 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T17      2461                         2648
HUMCA1XIA_T19      2837                         3475



Segment cluster HUMCA1XIA_node_92 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 857 below describes the
starting and ending position of this segment on each transcript.

Table 857 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3487                         3615



According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 

Segment cluster HUMCA1XIA_node_11 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 858 below describes the starting and ending position of this segment
on each transcript.

Table 858 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1099                         1215
HUMCA1XIA_T17      1099                         1215
HUMCA1XIA_T19      1099                         1215
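The division between the segments described so far and the short segments (up to about 120 bp) can be illustrated by computing segment lengths from the tabulated positions. The following Python sketch is illustrative only: the names `segment_length`, `is_short` and `SHORT_LIMIT_BP` are hypothetical, coordinates are assumed to be 1-based and inclusive, and the "about 120 bp" wording of the text is treated as a hard limit purely for demonstration.

```python
SHORT_LIMIT_BP = 120  # "up to about 120 bp", treated here as a fixed cut-off

# One (start, end) pair per segment, copied from Tables 846-858.
# node_55 uses its HUMCA1XIA_T17 coordinates.
segments = {
    "node_0": (1, 424),
    "node_2": (425, 592),
    "node_4": (593, 806),
    "node_6": (807, 969),
    "node_8": (970, 1098),
    "node_9": (1099, 1271),
    "node_18": (1309, 1522),
    "node_54": (2407, 2836),
    "node_55": (2461, 2648),
    "node_92": (3487, 3615),
    "node_11": (1099, 1215),
}


def segment_length(start, end):
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


def is_short(start, end, limit=SHORT_LIMIT_BP):
    return segment_length(start, end) <= limit


short_nodes = sorted(name for name, (s, e) in segments.items() if is_short(s, e))
print(short_nodes)
```

Under these assumptions node_11 (117 bp) is the only segment listed so far that falls under the limit, while the shortest of the preceding segments, node_8 and node_92, are 129 bp each.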



Segment cluster HUMCA1XIA_node_15 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 859 below describes the starting and ending position of this segment
on each transcript.

Table 859 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1216                         1308
HUMCA1XIA_T17      1216                         1308
HUMCA1XIA_T19      1216                         1308



Segment cluster HUMCA1XIA_node_19 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 860 below describes the starting and ending position of this segment
on each transcript.

Table 860 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1523                         1563
HUMCA1XIA_T17      1523                         1563
HUMCA1XIA_T19      1523                         1563



Segment cluster HUMCA1XIA_node_21 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 861 below describes the starting and ending position of this segment
on each transcript.

Table 861 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1564                         1626
HUMCA1XIA_T17      1564                         1626
HUMCA1XIA_T19      1564                         1626

Segment cluster HUMCA1XIA_node_23 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 862 below describes the starting and ending position of this segment
on each transcript.

Table 862 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1627                         1668
HUMCA1XIA_T17      1627                         1668
HUMCA1XIA_T19      1627                         1668



Segment cluster HUMCA1XIA_node_25 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 863 below describes the starting and ending position of this segment
on each transcript.

Table 863 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1669                         1731
HUMCA1XIA_T17      1669                         1731
HUMCA1XIA_T19      1669                         1731



Segment cluster HUMCA1XIA_node_27 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 864 below describes the starting and ending position of this segment
on each transcript.

Table 864 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1732                         1806
HUMCA1XIA_T17      1732                         1806
HUMCA1XIA_T19      1732                         1806



Segment cluster HUMCA1XIA_node_29 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 865 below describes the starting and ending position of this segment
on each transcript.

Table 865 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1807                         1890
HUMCA1XIA_T17      1807                         1890
HUMCA1XIA_T19      1807                         1890



Segment cluster HUMCA1XIA_node_31 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 866 below describes the starting and ending position of this segment
on each transcript.



WO 2006/131783 



PCT/IB2005/004037 



920 



Table 866 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1891                         1947
HUMCA1XIA_T17      1891                         1947
HUMCA1XIA_T19      1891                         1947



Segment cluster HUMCA1XIA_node_33 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 867 below describes the starting and ending position of this segment
on each transcript.

Table 867 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      1948                         2001
HUMCA1XIA_T17      1948                         2001
HUMCA1XIA_T19      1948                         2001



Segment cluster HUMCA1XIA_node_35 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 868 below describes the starting and ending position of this segment
on each transcript.

Table 868 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2002                         2055
HUMCA1XIA_T17      2002                         2055
HUMCA1XIA_T19      2002                         2055



Segment cluster HUMCA1XIA_node_37 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 869 below describes the starting and ending position of this segment
on each transcript.

Table 869 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2056                         2109
HUMCA1XIA_T17      2056                         2109
HUMCA1XIA_T19      2056                         2109



Segment cluster HUMCA1XIA_node_39 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 870 below describes the starting and ending position of this segment
on each transcript.

Table 870 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2110                         2163
HUMCA1XIA_T17      2110                         2163
HUMCA1XIA_T19      2110                         2163

Segment cluster HUMCA1XIA_node_41 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 871 below describes the starting and ending position of this segment
on each transcript.

Table 871 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2164                         2217
HUMCA1XIA_T17      2164                         2217
HUMCA1XIA_T19      2164                         2217



Segment cluster HUMCA1XIA_node_43 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 872 below describes the starting and ending position of this segment
on each transcript.

Table 872 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2218                         2262
HUMCA1XIA_T17      2218                         2262
HUMCA1XIA_T19      2218                         2262



Segment cluster HUMCA1XIA_node_45 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16 and HUMCA1XIA_T17. Table
873 below describes the starting and ending position of this segment on each transcript.

Table 873 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2263                         2316
HUMCA1XIA_T17      2263                         2316



Segment cluster HUMCA1XIA_node_47 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 874 below describes the starting and ending position of this segment
on each transcript.

Table 874 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2317                         2361
HUMCA1XIA_T17      2317                         2361
HUMCA1XIA_T19      2263                         2307



Segment cluster HUMCA1XIA_node_49 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 875 below describes the starting and ending position of this segment
on each transcript.

Table 875 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2362                         2415
HUMCA1XIA_T17      2362                         2415
HUMCA1XIA_T19      2308                         2361



Segment cluster HUMCA1XIA_node_51 according to the present invention is supported
by 7 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16, HUMCA1XIA_T17 and
HUMCA1XIA_T19. Table 876 below describes the starting and ending position of this segment
on each transcript.

Table 876 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2416                         2460
HUMCA1XIA_T17      2416                         2460
HUMCA1XIA_T19      2362                         2406
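From node 47 onward, the positions on HUMCA1XIA_T19 differ from those on HUMCA1XIA_T16 and HUMCA1XIA_T17. The following Python sketch is illustrative only and shows one way to check that this shift equals the length of node 45, which Table 873 lists for T16 and T17 but not for T19. The helper name `seg_len` is hypothetical and coordinates are assumed to be 1-based and inclusive.

```python
# (start, end) per node, copied from Tables 872-876.
t16 = {
    "node_43": (2218, 2262),
    "node_45": (2263, 2316),
    "node_47": (2317, 2361),
    "node_49": (2362, 2415),
    "node_51": (2416, 2460),
}
t19 = {
    "node_43": (2218, 2262),
    "node_47": (2263, 2307),
    "node_49": (2308, 2361),
    "node_51": (2362, 2406),
}


def seg_len(span):
    """Length in nucleotides of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


skipped_nt = seg_len(t16["node_45"])                      # node absent from T19
offsets = {node: t16[node][0] - t19[node][0] for node in t19}
skipped_residues = skipped_nt // 3

print(skipped_nt, offsets, skipped_residues)
```

Under these assumptions node 45 is 54 nucleotides long, and every node after it is shifted by exactly 54 positions on HUMCA1XIA_T19. This corresponds to 18 codons, which is consistent with the comparison report for HUMCA1XIA_P16, where amino acids 649 - 696 of the variant align to amino acids 667 - 714 of CA1B_HUMAN.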



Segment cluster HUMCA1XIA_node_57 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 877 below describes the
starting and ending position of this segment on each transcript.

Table 877 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2461                         2514

Segment cluster HUMCA1XIA_node_59 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 878 below describes the
starting and ending position of this segment on each transcript.

Table 878 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2515                         2559



Segment cluster HUMCA1XIA_node_62 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 879 below describes the
starting and ending position of this segment on each transcript.

Table 879 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2560                         2613



Segment cluster HUMCA1XIA_node_64 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 880 below describes the
starting and ending position of this segment on each transcript.

Table 880 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2614                         2658

Segment cluster HUMCA1XIA_node_66 according to the present invention is supported
by 4 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 881 below describes the
starting and ending position of this segment on each transcript.

Table 881 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2659                         2712



Segment cluster HUMCA1XIA_node_68 according to the present invention is supported
by 7 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 882 below describes the
starting and ending position of this segment on each transcript.

Table 882 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2713                         2820



Segment cluster HUMCA1XIA_node_70 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 883 below describes the
starting and ending position of this segment on each transcript.

Table 883 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2821                         2874









Segment cluster HUMCA1XIA_node_72 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 884 below describes the
starting and ending position of this segment on each transcript.

Table 884 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2875                         2928



Segment cluster HUMCA1XIA_node_74 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 885 below describes the
starting and ending position of this segment on each transcript.

Table 885 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2929                         2973



Segment cluster HUMCA1XIA_node_76 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 886 below describes the
starting and ending position of this segment on each transcript.

Table 886 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      2974                         3027



Segment cluster HUMCA1XIA_node_78 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 887 below describes the
starting and ending position of this segment on each transcript.

Table 887 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3028                         3072



Segment cluster HUMCA1XIA_node_81 according to the present invention is supported
by 8 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 888 below describes the
starting and ending position of this segment on each transcript.

Table 888 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3073                         3126



Segment cluster HUMCA1XIA_node_83 according to the present invention is supported
by 7 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 889 below describes the
starting and ending position of this segment on each transcript.

Table 889 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3127                         3180



Segment cluster HUMCA1XIA_node_85 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 890 below describes the
starting and ending position of this segment on each transcript.

Table 890 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3181                         3234



Segment cluster HUMCA1XIA_node_87 according to the present invention is supported
by 10 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 891 below describes the
starting and ending position of this segment on each transcript.

Table 891 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3235                         3342



Segment cluster HUMCA1XIA_node_89 according to the present invention is supported
by 9 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 892 below describes the
starting and ending position of this segment on each transcript.

Table 892 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3343                         3432



Segment cluster HUMCA1XIA_node_91 according to the present invention is supported
by 11 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): HUMCA1XIA_T16. Table 893 below describes the
starting and ending position of this segment on each transcript.

Table 893 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HUMCA1XIA_T16      3433                         3486
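
The segment location tables above can be read as inclusive coordinate ranges on a transcript. The following is a minimal illustrative sketch, not part of the described method, that records some of the HUMCA1XIA_T16 ranges from Tables 877 to 882 and checks that consecutive segments abut one another. The names `Segment`, `length` and `are_contiguous` are hypothetical, and the assumption that positions are 1-based and inclusive is ours.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One row of a 'Segment location on transcripts' table (assumed 1-based, inclusive)."""
    name: str
    start: int
    end: int


def length(seg: Segment) -> int:
    # Inclusive coordinates, so both end points count.
    return seg.end - seg.start + 1


def are_contiguous(segments: List[Segment]) -> bool:
    # True when each segment starts exactly one position after the previous one ends.
    ordered = sorted(segments, key=lambda s: s.start)
    return all(b.start == a.end + 1 for a, b in zip(ordered, ordered[1:]))


# Values copied from Tables 877-882 for transcript HUMCA1XIA_T16.
T16_SEGMENTS = [
    Segment("HUMCA1XIA_node_57", 2461, 2514),
    Segment("HUMCA1XIA_node_59", 2515, 2559),
    Segment("HUMCA1XIA_node_62", 2560, 2613),
    Segment("HUMCA1XIA_node_64", 2614, 2658),
    Segment("HUMCA1XIA_node_66", 2659, 2712),
    Segment("HUMCA1XIA_node_68", 2713, 2820),
]

if __name__ == "__main__":
    for seg in T16_SEGMENTS:
        print(seg.name, length(seg))
    print("contiguous:", are_contiguous(T16_SEGMENTS))
```

Under these assumptions the listed segments tile the transcript without gaps or overlaps, which is what one would expect of adjacent nodes of a single transcript.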



Variant protein alignment to the previously known protein:
Sequence name: CA1B_HUMAN_V5

Sequence documentation:

Alignment of: HUMCA1XIA_P14 x CA1B_HUMAN_V5

Alignment segment 1/1:

Quality: 10456.00
Escore: 0
Matching length: 1058
Total length: 1058
Matching Percent Similarity: 99.91
Matching Percent Identity: 99.91
Total Percent Similarity: 99.91
Total Percent Identity: 99.91
Gaps: 0

Alignment:

   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50

  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100

 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150

 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200

 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250

 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300

 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350

 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400

 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450

 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500

 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550

 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600

 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGP 650
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGP 650

 651 RGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 700
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 651 RGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 700

 701 LPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPP 750
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 701 LPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPP 750

 751 GPQGPIGYPGPRGVKGADGVRGLKGSKGEKGEDGFPGFKGDMGLKGDRGE 800
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 751 GPQGPIGYPGPRGVKGADGVRGLKGSKGEKGEDGFPGFKGDMGLKGDRGE 800

 801 VGQIGPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPGRQG 850
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 801 VGQIGPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPGRQG 850

 851 PKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKP 900
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 851 PKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKP 900

 901 GPKGTSGGDGPPGPPGERGPQGPQGPVGFPGPKGPPGPPGKDGLPGHPGQ 950
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 901 GPKGTSGGDGPPGPPGERGPQGPQGPVGFPGPKGPPGPPGKDGLPGHPGQ 950

 951 RGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQGLPG 1000
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 951 RGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQGLPG 1000

1001 AAGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQ 1050
     ||||||||||||||||||||||||||||||||||||||||||||||||||
1001 AAGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQ 1050

1051 GPPGPVVS 1058
     |||||| |
1051 GPPGPVGS 1058



Sequence name: CA1B_HUMAN

Sequence documentation:

Alignment of: HUMCA1XIA_P15 x CA1B_HUMAN

Alignment segment 1/1:

Quality: 7073.00
Escore: 0
Matching length: 714
Total length: 714
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50

  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100

 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150

 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200

 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250

 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300

 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350

 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400

 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450

 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500

 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550

 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600

 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGP 650
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGP 650

 651 RGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 700
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 651 RGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 700

 701 LPGPQGPIGPPGEK 714
     ||||||||||||||
 701 LPGPQGPIGPPGEK 714



Sequence name: CA1B_HUMAN

Sequence documentation:

Alignment of: HUMCA1XIA_P16 x CA1B_HUMAN

Alignment segment 1/1:

Quality: 6795.00
Escore: 0
Matching length: 696
Total length: 714
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 97.48
Total Percent Identity: 97.48
Gaps: 1

Alignment:



   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50

  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100

 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150

 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200

 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250

 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 AQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEA 300

 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 NIVDDFQEYNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDS 350

 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 QRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYEYKEYEDKPTSPPNEE 400

 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 FGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPA 450

 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYG 500

 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 GDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSG 550

 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 551 AKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDR 600

 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEA.. 648
     ||||||||||||||||||||||||||||||||||||||||||||||||
 601 GFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGP 650

 649 ................GMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 682
                     ||||||||||||||||||||||||||||||||||
 651 RGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQG 700

 683 LPGPQGPIGPPGEK 696
     ||||||||||||||
 701 LPGPQGPIGPPGEK 714
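
The summary figures reported with the alignments above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes, as our reading of the reported numbers rather than a statement of how the alignment software computes them, that "Total Percent Identity" is the number of identically matched residues divided by the total length. The function name `percent` is hypothetical.

```python
def percent(matched: int, total: int) -> float:
    """Return matched/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * matched / total, 2)


if __name__ == "__main__":
    # HUMCA1XIA_P16 x CA1B_HUMAN: 696 matched positions, all identical,
    # over a total length of 714 (one gap of 18 residues).
    print(percent(696, 714))
    # HUMCA1XIA_P14 x CA1B_HUMAN_V5: 1058 aligned positions with a single
    # mismatch (V versus G at position 1057).
    print(percent(1057, 1058))
```

Under this assumption the two calls reproduce the reported values of 97.48 and 99.91 respectively.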



Sequence name: CA1B_HUMAN



WO 2006/131783 



PCT/IB2005/004037 






Sequence documentation:

Alignment of: HUMCA1XIA_P17 x CA1B_HUMAN

Alignment segment 1/1:

Quality: 2561.00
Escore: 0
Matching length: 260
Total length: 260
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:



   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNS 50

  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 PEGISKTTGFCTNRKNSKGSDTAYRVSKQAQLSAPTKQLFPGGTFPEDFS 100

 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 ILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPED 150

 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 YPLFRTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDT 200

 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 NGITVFGTRILDEEVFEGDIQQFLITGDPKAAYDYCEHYSPDCDSSAPKA 250

 251 AQAQEPQIDE 260
     ||||||||||
 251 AQAQEPQIDE 260



Expression of Homo sapiens collagen, type XI, alpha 1 (COL11A1) HUMCA1X1A transcripts
which are detectable by amplicon as depicted in sequence name HUMCA1X1A seg55 in normal
and cancerous lung tissues

Expression of Homo sapiens collagen, type XI, alpha 1 (COL11A1) transcripts detectable
by or according to seg55, HUMCA1X1A seg55 amplicon (SEQ ID NO:1663) and primers
HUMCA1X1A seg55F (SEQ ID NO:1661) and HUMCA1X1A seg55R (SEQ ID NO:1662) was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO:331), was measured similarly. For each RT sample, the expression of the above
amplicon was normalized to the geometric mean of the quantities of the housekeeping genes.
The normalized quantity of each RT sample was then divided by the median of the quantities of
the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above), to
obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.
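
The normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the experiments: the function names `normalize` and `fold_upregulation` are hypothetical, the quantities are assumed to be positive linear-scale values, and the numbers in the usage example are invented purely to show the arithmetic.

```python
from statistics import geometric_mean, median
from typing import Dict, Iterable, Sequence


def normalize(amplicon_qty: float, housekeeping_qtys: Sequence[float]) -> float:
    """Divide the amplicon quantity by the geometric mean of the
    housekeeping-gene quantities measured in the same RT sample."""
    return amplicon_qty / geometric_mean(housekeeping_qtys)


def fold_upregulation(
    normalized: Dict[str, float], normal_pm_samples: Iterable[str]
) -> Dict[str, float]:
    """Divide every normalized quantity by the median normalized quantity
    of the normal post-mortem (PM) samples."""
    baseline = median(normalized[s] for s in normal_pm_samples)
    return {sample: qty / baseline for sample, qty in normalized.items()}


if __name__ == "__main__":
    # Invented example values: four housekeeping quantities per sample.
    raw = {
        "normal_1": (1.0, [1.0, 1.0, 1.0, 1.0]),
        "normal_2": (4.0, [2.0, 2.0, 2.0, 2.0]),
        "normal_3": (3.0, [1.0, 1.0, 1.0, 1.0]),
        "tumor_1": (40.0, [1.0, 4.0, 2.0, 2.0]),
    }
    normalized = {name: normalize(q, hk) for name, (q, hk) in raw.items()}
    folds = fold_upregulation(normalized, ["normal_1", "normal_2", "normal_3"])
    for name, fold in folds.items():
        print(f"{name}: {fold:.2f}-fold")
```

The geometric mean is used for the housekeeping genes, as the text states, so that no single highly expressed reference gene dominates the normalization factor.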

Figure 67 is a histogram showing over-expression of the above-indicated Homo sapiens
collagen, type XI, alpha 1 (COL11A1) transcripts in cancerous lung samples relative to the
normal samples. Values represent the average of duplicate experiments. Error bars indicate the
minimal and maximal values obtained.

As is evident from Figure 67, the expression of Homo sapiens collagen, type XI, alpha 1
(COL11A1) transcripts detectable by the above amplicon(s) in cancer samples was significantly
higher than in the non-cancerous samples (Sample Nos. 47-50, 90-93, 96-99, Table 2). Notably,
an over-expression of at least 5 fold was found in 11 out of 15 adenocarcinoma samples, 11 out
of 16 squamous cell carcinoma samples, and in 2 out of 4 large cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: HUMCA1X1A seg55F forward
primer; and HUMCA1X1A seg55R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: HUMCA1X1A
seg55.

Forward primer - HUMCA1X1A seg55F (SEQ ID NO:1661):
TTCTCATAGTATTCCATTGATTGGGTA
Reverse primer - HUMCA1X1A seg55R (SEQ ID NO:1662):
CACCGGTATGGAGAATAGCGA
Amplicon (SEQ ID NO:1663):
TTCTCATAGTATTCCATTGATTGGGTATACCAGGTTCTGTTTACTTTTACTTGGCAGT
TGATAGAATAGGTGTAGTTTATACTTTTTCGCTATTCTCCATACCGGTG
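
As a consistency check on the sequences listed above, an amplicon is expected to begin with the forward primer and to end with the reverse complement of the reverse primer. The sketch below is illustrative only; the helper names `reverse_complement` and `primers_flank_amplicon` are hypothetical, and the sequences are copied from the listing above.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank_amplicon(forward: str, reverse: str, amplicon: str) -> bool:
    """True when the amplicon starts with the forward primer and ends
    with the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


SEG55_FORWARD = "TTCTCATAGTATTCCATTGATTGGGTA"  # SEQ ID NO:1661
SEG55_REVERSE = "CACCGGTATGGAGAATAGCGA"        # SEQ ID NO:1662
SEG55_AMPLICON = (                             # SEQ ID NO:1663
    "TTCTCATAGTATTCCATTGATTGGGTATACCAGGTTCTGTTTACTTTTACTTGGCAGT"
    "TGATAGAATAGGTGTAGTTTATACTTTTTCGCTATTCTCCATACCGGTG"
)

if __name__ == "__main__":
    print(primers_flank_amplicon(SEG55_FORWARD, SEG55_REVERSE, SEG55_AMPLICON))
```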




DESCRIPTION FOR CLUSTER T11628
Cluster T11628 features 6 transcript(s) and 25 segment(s) of interest, the names for which
are given in Tables 894 and 895, respectively, the sequences themselves are given at the end of
the application. The selected protein variants are given in table 896.

Table 894 - Transcripts of interest

Transcript Name      Sequence ID No.
T11628_PEA_1_T3      103
T11628_PEA_1_T4      104
T11628_PEA_1_T5      105
T11628_PEA_1_T7      106
T11628_PEA_1_T9      107
T11628_PEA_1_T11     108


Table 895 - Segments of interest

Segment Name              Sequence ID No.
T11628_PEA_1_node_7       789
T11628_PEA_1_node_11      790
T11628_PEA_1_node_16      791
T11628_PEA_1_node_22      792
T11628_PEA_1_node_25      793
T11628_PEA_1_node_31      794
T11628_PEA_1_node_37      795
T11628_PEA_1_node_0       796
T11628_PEA_1_node_4       797
T11628_PEA_1_node_9       798
T11628_PEA_1_node_13      799
T11628_PEA_1_node_14      800
T11628_PEA_1_node_17      801
T11628_PEA_1_node_18      802
T11628_PEA_1_node_19      803
T11628_PEA_1_node_24      804
T11628_PEA_1_node_27      805
T11628_PEA_1_node_28      806
T11628_PEA_1_node_29      807
T11628_PEA_1_node_30      808
T11628_PEA_1_node_32      809
T11628_PEA_1_node_33      810
T11628_PEA_1_node_34      811
T11628_PEA_1_node_35      812
T11628_PEA_1_node_36      813



Table 896 - Proteins of interest

Protein Name         Sequence ID No.    Corresponding Transcript(s)
T11628_PEA_1_P2      1376               T11628_PEA_1_T3; T11628_PEA_1_T5; T11628_PEA_1_T7
T11628_PEA_1_P5      1377               T11628_PEA_1_T9
T11628_PEA_1_P7      1378               T11628_PEA_1_T11
T11628_PEA_1_P10     1379               T11628_PEA_1_T4



These sequences are variants of the known protein Myoglobin (SwissProt accession
identifier MYG_HUMAN), SEQ ID NO: 1448, referred to herein as the previously known
protein.

Protein Myoglobin is known or believed to have the following function(s): Serves as a
reserve supply of oxygen and facilitates the movement of oxygen within muscles. The sequence
for protein Myoglobin is given at the end of the application, as "Myoglobin amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 897.

Table 897 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
54                                        E -> K. /FTId=VAR_003180.
133                                       K -> N. /FTId=VAR_003181.
139                                       R -> Q. /FTId=VAR_003182.
139                                       R -> W. /FTId=VAR_003183.
128                                       Q -> E



As noted above, cluster T11628 features 6 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein Myoglobin. A
description of each variant protein according to the present invention is now provided.

Variant protein T11628_PEA_1_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T11628_PEA_1_T3. An alignment is given to the known protein (Myoglobin) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between T11628_PEA_1_P2 and Q8WVH6 (SEQ ID NO:1450):
1. An isolated chimeric polypeptide encoding for T11628_PEA_1_P2, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE
corresponding to amino acids 1 - 55 of T11628_PEA_1_P2, and a second amino acid sequence
being at least 90 % homologous to

MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEF 
20 LQSKHPGDFGADAQGAJVDNn^ corresponding to amino 

acids 1 - 99 of Q8WVH6, which also corresponds to amino acids 56 - 154 of
T11628_PEA_1_P2, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of T11628_PEA_1_P2, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE of
T11628_PEA_1_P2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellularly because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein T11628_PEA_1_P2 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 898, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T11628_PEA_1_P2
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 898 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
26                                        G ->                         No
44                                        F ->                         No
92                                        Q -> R                       No
135                                       A ->                         No
141                                       K ->                         No
153                                       Q ->                         No

Variant protein T11628_PEA_1_P2 is encoded by the following transcript(s):
T11628_PEA_1_T3, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T11628_PEA_1_T3 is shown in bold; this coding portion starts at
position 220 and ends at position 681. The transcript also has the following SNPs as listed in
Table 899 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T11628_PEA_1_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 899 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
83 | G -> A | Yes
93 | G -> A | Yes
95 | G -> A | Yes
146 | G -> A | Yes
295 | G -> | No
349 | T -> | No
393 | G -> A | Yes
423 | C -> T | Yes
494 | A -> G | No
498 | G -> A | No
623 | C -> | No
642 | G -> | No
678 | G -> | No
686 | C -> | No
686 | C -> A | No
717 | C -> | No
787 | T -> G | No
820 | G -> T | No
826 | G -> T | No
850 | C -> | No
934 | T -> G | No
975 | A -> G | Yes
1117 | G -> | No
1218 | A -> G | No
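
The relationship between the nucleotide positions of Table 899 and the amino acid positions of Table 898 can be illustrated with a short sketch. It assumes that position 220 is the first base of the first codon and that codons run in consecutive triplets up to position 681, as the stated coding coordinates suggest; the function name `codon_number` is hypothetical and is not part of the application.

```python
# Coding portion of transcript T11628_PEA_1_T3, as stated in the text
# (1-based, inclusive coordinates).
CDS_START = 220
CDS_END = 681


def codon_number(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Return the 1-based codon (amino acid) number that contains a
    nucleotide position, or None if the position is outside the coding
    portion. Assumes the reading frame starts at cds_start."""
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Number of complete codons spanned by the coding portion.
n_codons = (CDS_END - CDS_START + 1) // 3  # 154

# Transcript SNP positions from Table 899 that fall on the amino acid
# positions listed in Table 898.
snp_to_codon = {nt: codon_number(nt) for nt in (295, 349, 494, 623, 642, 678)}
print(n_codons)
print(snp_to_codon)
# {295: 26, 349: 44, 494: 92, 623: 135, 642: 141, 678: 153}
```

Under these assumptions the six coding-region positions map onto amino acid positions 26, 44, 92, 135, 141 and 153, and positions such as 83 or 686 fall outside the coding portion.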



Variant protein T11628_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T11628_PEA_1_T9. An alignment is given to the known protein (Myoglobin) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between T11628_PEA_1_P5 and MYG_HUMAN_V1 (SEQ ID
NO:1449):

1. An isolated chimeric polypeptide encoding for T11628_PEA_1_P5, comprising a first
amino acid sequence being at least 90% homologous to
MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKI
LQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG corresponding to amino
acids 56 - 154 of MYG_HUMAN_V1, which also corresponds to amino acids 1 - 99 of
T11628_PEA_1_P5.

It should be noted that the known protein sequence (MYG_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for MYG_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 900 - Changes to MYG_HUMAN_V1

SNP position(s) on amino acid sequence | Type of change
1 | init_met

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein T11628_PEA_1_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 901 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T11628_PEA_1_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 901 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
37 | Q -> R | No
80 | A -> | No
86 | K -> | No
98 | Q -> | No
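
Because the comparison report places amino acids 1 - 99 of T11628_PEA_1_P5 against amino acids 56 - 154 of MYG_HUMAN_V1, positions in Table 901 can be translated into the numbering of the known protein by a constant shift. The sketch below only illustrates that arithmetic; the helper name `to_known_position` is hypothetical, and the shift is valid only if the correspondence is gap-free, as the stated ranges imply.

```python
# Correspondence stated in the comparison report: amino acids 56-154 of
# MYG_HUMAN_V1 correspond to amino acids 1-99 of T11628_PEA_1_P5.
KNOWN_START = 56
VARIANT_START = 1
OFFSET = KNOWN_START - VARIANT_START  # constant shift of 55 residues


def to_known_position(variant_pos):
    """Map a 1-based position on the variant onto MYG_HUMAN_V1 numbering,
    assuming an ungapped correspondence over the whole range."""
    return variant_pos + OFFSET


table_901_positions = [37, 80, 86, 98]
mapped = [to_known_position(p) for p in table_901_positions]
print(mapped)  # [92, 135, 141, 153]
```

The shifted values coincide with the positions 92, 135, 141 and 153 listed for the full-length numbering in Table 898.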



Variant protein T11628_PEA_1_P5 is encoded by the following transcript(s):
T11628_PEA_1_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T11628_PEA_1_T9 is shown in bold; this coding portion starts at
position 211 and ends at position 507. The transcript also has the following SNPs as listed in
Table 902 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T11628_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 902 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
2 | C -> T | Yes
175 | T -> | No
219 | G -> A | Yes
249 | C -> T | Yes
320 | A -> G | No
324 | G -> A | No
449 | C -> | No
468 | G -> | No
504 | G -> | No
512 | C -> | No
512 | C -> A | No
543 | C -> | No
613 | T -> G | No
646 | G -> T | No
652 | G -> T | No
676 | C -> | No
760 | T -> G | No
801 | A -> G | Yes
943 | G -> | No
1044 | A -> G | No



Variant protein T11628_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T11628_PEA_1_T11. An alignment is given to the known protein (Myoglobin) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between T11628_PEA_1_P7 and MYG_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for T11628_PEA_1_P7, comprising a first
amino acid sequence being at least 90% homologous to
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKPDKPKHLKSEDEMK
ASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQ
SKHPGDFGADAQGAMNK corresponding to amino acids 1 - 134 of MYG_HUMAN_V1,
which also corresponds to amino acids 1 - 134 of T11628_PEA_1_P7, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence G
corresponding to amino acids 135 - 135 of T11628_PEA_1_P7, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

It should be noted that the known protein sequence (MYG_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for MYG_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 903 - Changes to MYG_HUMAN_V1

SNP position(s) on amino acid sequence | Type of change
1 | init_met



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein T11628_PEA_1_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 904 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T11628_PEA_1_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 904 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
26 | G -> | No
44 | F -> | No
92 | Q -> R | No



Variant protein T11628_PEA_1_P7 is encoded by the following transcript(s):
T11628_PEA_1_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T11628_PEA_1_T11 is shown in bold; this coding portion starts at
position 319 and ends at position 723. The transcript also has the following SNPs as listed in
Table 905 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T11628_PEA_1_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 905 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
394 | G -> | No
448 | T -> | No
492 | G -> A | Yes
522 | C -> T | Yes
593 | A -> G | No
597 | G -> A | No
728 | C -> | No
728 | C -> A | No
759 | C -> | No
829 | T -> G | No
862 | G -> T | No
868 | G -> T | No
892 | C -> | No
976 | T -> G | No
1017 | A -> G | Yes
1159 | G -> | No
1260 | A -> G | No



Variant protein T11628_PEA_1_P10 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
T11628_PEA_1_T4. An alignment is given to the known protein (Myoglobin) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between T11628_PEA_1_P10 and Q8WVH6 (SEQ ID NO: 1450):

1. An isolated chimeric polypeptide encoding for T11628_PEA_1_P10, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE
corresponding to amino acids 1 - 55 of T11628_PEA_1_P10, and a second amino acid sequence
being at least 90% homologous to
MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPV
LQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG corresponding to amino
acids 1 - 99 of Q8WVH6, which also corresponds to amino acids 56 - 154 of
T11628_PEA_1_P10, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a head of T11628_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDE of
T11628_PEA_1_P10.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein T11628_PEA_1_P10 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 906 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein T11628_PEA_1_P10
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 906 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
26 | G -> | No
44 | F -> | No
92 | Q -> R | No
135 | A -> | No
141 | K -> | No
153 | Q -> | No



Variant protein T11628_PEA_1_P10 is encoded by the following transcript(s):
T11628_PEA_1_T4, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript T11628_PEA_1_T4 is shown in bold; this coding portion starts at
position 205 and ends at position 666. The transcript also has the following SNPs as listed in
Table 907 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein T11628_PEA_1_P10 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 907 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
280 | G -> | No
334 | T -> | No
378 | G -> A | Yes
408 | C -> T | Yes
479 | A -> G | No
483 | G -> A | No
608 | C -> | No
627 | G -> | No
663 | G -> | No
671 | C -> | No
671 | C -> A | No
702 | C -> | No
772 | T -> G | No
805 | G -> T | No
811 | G -> T | No
835 | C -> | No
919 | T -> G | No
960 | A -> G | Yes
1102 | G -> | No
1203 | A -> G | No



As noted above, cluster T11628 features 25 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster T11628_PEA_1_node_7 according to the present invention is supported
by 9 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3. Table 908 below describes the
starting and ending position of this segment on each transcript.

Table 908 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 1 | 211



Segment cluster T11628_PEA_1_node_11 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T5. Table 909 below describes the
starting and ending position of this segment on each transcript.

Table 909 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T5 | 48 | 178

Segment cluster T11628_PEA_1_node_16 according to the present invention is supported
by 38 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T11. Table 910 below describes the
starting and ending position of this segment on each transcript.

Table 910 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T11 | 1 | 214



Segment cluster T11628_PEA_1_node_22 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T9. Table 911 below describes the
starting and ending position of this segment on each transcript.

Table 911 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T9 | 1 | 140

Segment cluster T11628_PEA_1_node_25 according to the present invention is supported
by 129 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
912 below describes the starting and ending position of this segment on each transcript.

Table 912 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 395 | 537
T11628_PEA_1_T4 | 380 | 522
T11628_PEA_1_T5 | 362 | 504
T11628_PEA_1_T7 | 347 | 489
T11628_PEA_1_T9 | 221 | 363
T11628_PEA_1_T11 | 494 | 636



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 913.

Table 913 - Oligonucleotides related to this segment

Oligonucleotide name | Overexpressed in cancers | Chip reference
T11628_0_9_0 | lung malignant tumors | LUN



Segment cluster T11628_PEA_1_node_31 according to the present invention is supported
by 137 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
914 below describes the starting and ending position of this segment on each transcript.

Table 914 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 702 | 831
T11628_PEA_1_T4 | 687 | 816
T11628_PEA_1_T5 | 669 | 798
T11628_PEA_1_T7 | 654 | 783
T11628_PEA_1_T9 | 528 | 657
T11628_PEA_1_T11 | 744 | 873









Segment cluster T11628_PEA_1_node_37 according to the present invention is supported
by 99 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
915 below describes the starting and ending position of this segment on each transcript.

Table 915 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 1086 | 1225
T11628_PEA_1_T4 | 1071 | 1210
T11628_PEA_1_T5 | 1053 | 1192
T11628_PEA_1_T7 | 1038 | 1177
T11628_PEA_1_T9 | 912 | 1051
T11628_PEA_1_T11 | 1128 | 1267



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster T11628_PEA_1_node_0 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T4. Table 916 below describes the
starting and ending position of this segment on each transcript.

Table 916 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T4 | 1 | 93

Segment cluster T11628_PEA_1_node_4 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T4. Table 917 below describes the
starting and ending position of this segment on each transcript.

Table 917 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T4 | 94 | 196



Segment cluster T11628_PEA_1_node_9 according to the present invention is supported
by 16 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T5 and T11628_PEA_1_T7. Table
918 below describes the starting and ending position of this segment on each transcript.

Table 918 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T5 | 1 | 47
T11628_PEA_1_T7 | 1 | 47

Segment cluster T11628_PEA_1_node_13 according to the present invention can be
found in the following transcript(s): T11628_PEA_1_T7. Table 919 below describes the starting
and ending position of this segment on each transcript.

Table 919 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T7 | 48 | 65

Segment cluster T11628_PEA_1_node_14 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T7. Table 920 below describes the
starting and ending position of this segment on each transcript.

Table 920 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T7 | 66 | 163



Segment cluster T11628_PEA_1_node_17 according to the present invention is supported
by 55 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T11. Table 921 below describes the
starting and ending position of this segment on each transcript.

Table 921 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T11 | 215 | 310

Segment cluster T11628_PEA_1_node_18 according to the present invention is supported
by 98 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7 and T11628_PEA_1_T11. Table 922 below describes
the starting and ending position of this segment on each transcript.

Table 922 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 212 | 289
T11628_PEA_1_T4 | 197 | 274
T11628_PEA_1_T5 | 179 | 256
T11628_PEA_1_T7 | 164 | 241
T11628_PEA_1_T11 | 311 | 388



Segment cluster T11628_PEA_1_node_19 according to the present invention can be
found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7 and T11628_PEA_1_T11. Table 923 below describes
the starting and ending position of this segment on each transcript.

Table 923 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 290 | 314
T11628_PEA_1_T4 | 275 | 299
T11628_PEA_1_T5 | 257 | 281
T11628_PEA_1_T7 | 242 | 266
T11628_PEA_1_T11 | 389 | 413



Segment cluster T11628_PEA_1_node_24 according to the present invention is supported
by 112 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
924 below describes the starting and ending position of this segment on each transcript.

Table 924 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 315 | 394
T11628_PEA_1_T4 | 300 | 379
T11628_PEA_1_T5 | 282 | 361
T11628_PEA_1_T7 | 267 | 346
T11628_PEA_1_T9 | 141 | 220
T11628_PEA_1_T11 | 414 | 493



Segment cluster T11628_PEA_1_node_27 according to the present invention is supported
by 119 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
925 below describes the starting and ending position of this segment on each transcript.

Table 925 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 538 | 621
T11628_PEA_1_T4 | 523 | 606
T11628_PEA_1_T5 | 505 | 588
T11628_PEA_1_T7 | 490 | 573
T11628_PEA_1_T9 | 364 | 447
T11628_PEA_1_T11 | 637 | 720



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 926.

Table 926 - Oligonucleotides related to this segment

Oligonucleotide name | Overexpressed in cancers | Chip reference
T11628_0_9_0 | lung malignant tumors | LUN



Segment cluster T11628_PEA_1_node_28 according to the present invention is supported
by 115 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7 and T11628_PEA_1_T9. Table 927 below describes
the starting and ending position of this segment on each transcript.

Table 927 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 622 | 650
T11628_PEA_1_T4 | 607 | 635
T11628_PEA_1_T5 | 589 | 617
T11628_PEA_1_T7 | 574 | 602
T11628_PEA_1_T9 | 448 | 476

Segment cluster T11628_PEA_1_node_29 according to the present invention is supported
by 113 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7 and T11628_PEA_1_T9. Table 928 below describes
the starting and ending position of this segment on each transcript.

Table 928 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 651 | 678
T11628_PEA_1_T4 | 636 | 663
T11628_PEA_1_T5 | 618 | 645
T11628_PEA_1_T7 | 603 | 630
T11628_PEA_1_T9 | 477 | 504



Segment cluster T11628_PEA_1_node_30 according to the present invention can be
found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
929 below describes the starting and ending position of this segment on each transcript.

Table 929 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 679 | 701
T11628_PEA_1_T4 | 664 | 686
T11628_PEA_1_T5 | 646 | 668
T11628_PEA_1_T7 | 631 | 653
T11628_PEA_1_T9 | 505 | 527
T11628_PEA_1_T11 | 721 | 743



Segment cluster T11628_PEA_1_node_32 according to the present invention can be
found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
930 below describes the starting and ending position of this segment on each transcript.

Table 930 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 832 | 844
T11628_PEA_1_T4 | 817 | 829
T11628_PEA_1_T5 | 799 | 811
T11628_PEA_1_T7 | 784 | 796
T11628_PEA_1_T9 | 658 | 670
T11628_PEA_1_T11 | 874 | 886



Segment cluster T11628_PEA_1_node_33 according to the present invention can be
found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
931 below describes the starting and ending position of this segment on each transcript.

Table 931 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 845 | 866
T11628_PEA_1_T4 | 830 | 851
T11628_PEA_1_T5 | 812 | 833
T11628_PEA_1_T7 | 797 | 818
T11628_PEA_1_T9 | 671 | 692
T11628_PEA_1_T11 | 887 | 908



Segment cluster T11628_PEA_1_node_34 according to the present invention is supported
by 122 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
932 below describes the starting and ending position of this segment on each transcript.

Table 932 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
T11628_PEA_1_T3 | 867 | 911
T11628_PEA_1_T4 | 852 | 896
T11628_PEA_1_T5 | 834 | 878
T11628_PEA_1_T7 | 819 | 863
T11628_PEA_1_T9 | 693 | 737
T11628_PEA_1_T11 | 909 | 953



Segment cluster Tl 1628_PEA_l_node_35 according to the present invention is supported 
by 126 libraries. The number of libraries was determined as previously described. This segment 
5 can be found in the following transcript(s): Tl 1628JPEAJT3, T11628JPEA_1_T4, 

T11628_PEA_1JT5, T11628_PEA__1JT7, Tl 1628_PEA__1 JT9 andT11628JPEA_l_Tll. Table 
933 below describes the starting and ending position of this segment on each transcript. 



Table 933 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
T11628_PEA_1_T3      912                         967
T11628_PEA_1_T4      897                         952
T11628_PEA_1_T5      879                         934
T11628_PEA_1_T7      864                         919
T11628_PEA_1_T9      738                         793
T11628_PEA_1_T11     954                         1009



Segment cluster T11628_PEA_1_node_36 according to the present invention is supported
by 122 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): T11628_PEA_1_T3, T11628_PEA_1_T4,
T11628_PEA_1_T5, T11628_PEA_1_T7, T11628_PEA_1_T9 and T11628_PEA_1_T11. Table
934 below describes the starting and ending position of this segment on each transcript.

Table 934 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
T11628_PEA_1_T3      968                         1085
T11628_PEA_1_T4      953                         1070
T11628_PEA_1_T5      935                         1052
T11628_PEA_1_T7      920                         1037
T11628_PEA_1_T9      794                         911
T11628_PEA_1_T11     1010                        1127
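The segment location tables can be read together: on a given transcript, each segment begins one position after the preceding segment ends. The following Python sketch illustrates this for transcript T11628_PEA_1_T3 using the positions from Tables 931 to 934. The data layout and the function names `segment_length` and `check_contiguous` are assumptions made for this illustration only, and the positions are assumed to be 1-based and inclusive.

```python
# Start/end positions of four consecutive segments on transcript
# T11628_PEA_1_T3, copied from Tables 931-934.
# Assumption: positions are 1-based and inclusive.
SEGMENTS_T3 = [
    ("Table 931", 845, 866),
    ("Table 932", 867, 911),
    ("Table 933", 912, 967),
    ("Table 934", 968, 1085),
]


def segment_length(start, end):
    """Length of a segment given inclusive start and end positions."""
    return end - start + 1


def check_contiguous(segments):
    """Return True if every segment begins right after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1
        for prev, nxt in zip(segments, segments[1:])
    )


lengths = [segment_length(start, end) for _, start, end in SEGMENTS_T3]
contiguous = check_contiguous(SEGMENTS_T3)
```

The same check can be repeated for the other transcripts by substituting their rows; the segment lengths are expected to be the same on every transcript because only the offset of the segment differs.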



Variant protein alignment to the previously known protein:

Sequence name: Q8WVH6

Sequence documentation:

Alignment of: T11628_PEA_1_P2 x Q8WVH6

Alignment segment 1/1:

Quality: 962.00
Escore: 0
Matching length: 99                   Total length: 99
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

 56 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 105
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 50

106 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
    |||||||||||||||||||||||||||||||||||||||||||||||||
 51 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 99
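As an illustration of how an identity figure such as the one reported above can be checked, the following Python sketch counts identical columns between two aligned sequences. The exact formulas used by the alignment software for "Matching Percent Identity" and "Total Percent Identity" are not stated in this section, so the function `percent_identity` is an assumed, simplified definition (identical columns divided by alignment length), not the program's own calculation.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns holding identical residues.

    Both inputs must already be aligned (equal length); '-' marks a gap
    and never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# The two 99-residue rows of the alignment above, joined across both blocks.
VARIANT_ROW = (
    "MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL"
    "EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG"
)
KNOWN_ROW = (
    "MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL"
    "EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG"
)

identity = percent_identity(VARIANT_ROW, KNOWN_ROW)
```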



Sequence name: MYG_HUMAN_V1

Sequence documentation:

Alignment of: T11628_PEA_1_P5 x MYG_HUMAN_V1

Alignment segment 1/1:

Quality: 962.00
Escore: 0



WO 2006/131783 



PCT/IB2005/004037 



970 



Matching length: 99                   Total length: 99
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 56 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 105

 51 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 99
    |||||||||||||||||||||||||||||||||||||||||||||||||
106 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154



Sequence name: MYG_HUMAN_V1

Sequence documentation:

Alignment of: T11628_PEA_1_P7 x MYG_HUMAN_V1

Alignment segment 1/1:

Quality: 1315.00
Escore: 0
Matching length: 134                  Total length: 134
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHL 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHL 50

 51 KSEDEMKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 KSEDEMKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKI 100

101 PVKYLEFISECIIQVLQSKHPGDFGADAQGAMNK 134
    ||||||||||||||||||||||||||||||||||
101 PVKYLEFISECIIQVLQSKHPGDFGADAQGAMNK 134



Sequence name: Q8WVH6

Sequence documentation:

Alignment of: T11628_PEA_1_P10 x Q8WVH6

Alignment segment 1/1:

Quality: 962.00
Escore: 0
Matching length: 99                   Total length: 99
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

 56 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 105
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYL 50

106 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
    |||||||||||||||||||||||||||||||||||||||||||||||||
 51 EFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 99



DESCRIPTION FOR CLUSTER HUMCEA
Cluster HUMCEA features 5 transcript(s) and 42 segment(s) of interest, the names for
which are given in Tables 935 and 936, respectively, the sequences themselves are given at the
end of the application. The selected protein variants are given in Table 937.

Table 935 - Transcripts of interest

Transcript Name       Sequence ID No.
HUMCEA_PEA_1_T8       109
HUMCEA_PEA_1_T9       110
HUMCEA_PEA_1_T20      111
HUMCEA_PEA_1_T25      112
HUMCEA_PEA_1_T26      113


Table 936 - Segments of interest

Segment Name              Sequence ID No.
HUMCEA_PEA_1_node_0       814
HUMCEA_PEA_1_node_2       815
HUMCEA_PEA_1_node_11      816
HUMCEA_PEA_1_node_12      817
HUMCEA_PEA_1_node_31      818
HUMCEA_PEA_1_node_36      819
HUMCEA_PEA_1_node_44      820
HUMCEA_PEA_1_node_46      821
HUMCEA_PEA_1_node_63      822
HUMCEA_PEA_1_node_65      823
HUMCEA_PEA_1_node_67      824
HUMCEA_PEA_1_node_3       825
HUMCEA_PEA_1_node_7       826
HUMCEA_PEA_1_node_8       827
HUMCEA_PEA_1_node_9       828
HUMCEA_PEA_1_node_10      829
HUMCEA_PEA_1_node_15      830
HUMCEA_PEA_1_node_16      831
HUMCEA_PEA_1_node_17      832
HUMCEA_PEA_1_node_18      833
HUMCEA_PEA_1_node_19      834
HUMCEA_PEA_1_node_20      835
HUMCEA_PEA_1_node_21      836
HUMCEA_PEA_1_node_22      837
HUMCEA_PEA_1_node_23      838
HUMCEA_PEA_1_node_24      839
HUMCEA_PEA_1_node_27      840
HUMCEA_PEA_1_node_29      841
HUMCEA_PEA_1_node_30      842
HUMCEA_PEA_1_node_33      843
HUMCEA_PEA_1_node_34      844
HUMCEA_PEA_1_node_35      845
HUMCEA_PEA_1_node_45      846
HUMCEA_PEA_1_node_50      847
HUMCEA_PEA_1_node_51      848
HUMCEA_PEA_1_node_56      849
HUMCEA_PEA_1_node_57      850
HUMCEA_PEA_1_node_58      851
HUMCEA_PEA_1_node_60      852
HUMCEA_PEA_1_node_61      853
HUMCEA_PEA_1_node_62      854
HUMCEA_PEA_1_node_64      855



Table 937 - Proteins of interest

Protein Name          Sequence ID No.   Corresponding Transcript(s)
HUMCEA_PEA_1_P4       1380              HUMCEA_PEA_1_T8
HUMCEA_PEA_1_P5       1381              HUMCEA_PEA_1_T9
HUMCEA_PEA_1_P14      1382              HUMCEA_PEA_1_T20
HUMCEA_PEA_1_P19      1383              HUMCEA_PEA_1_T25
HUMCEA_PEA_1_P20      1384              HUMCEA_PEA_1_T26



These sequences are variants of the known protein Carcinoembryonic antigen-related cell
adhesion molecule 5 precursor (SwissProt accession identifier CEA5_HUMAN; known also
according to the synonyms Carcinoembryonic antigen; CEA; Meconium antigen 100; CD66e
antigen), SEQ ID NO: 1451, referred to herein as the previously known protein.

The sequence for protein Carcinoembryonic antigen-related cell adhesion molecule 5
precursor is given at the end of the application, as "Carcinoembryonic antigen-related cell
adhesion molecule 5 precursor amino acid sequence". Known polymorphisms for this sequence
are as shown in Table 938.

Table 938 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence   Comment
320                                      Missing



Protein Carcinoembryonic antigen-related cell adhesion molecule 5 precursor is believed
to be attached to the membrane by a GPI-anchor.

The previously known protein also has the following indication(s) and/or potential
therapeutic use(s): Cancer. It has been investigated for clinical/therapeutic use in humans, for
example as a target for an antibody or small molecule, and/or as a direct therapeutic; available
information related to these investigations is as follows. Potential pharmaceutically related or
therapeutically related activity or activities of the previously known protein are as follows:
Immunostimulant. A therapeutic role for a protein represented by the cluster has been predicted.
The cluster was assigned this field because there was information in the drug database or the
public databases (e.g., described herein above) that this protein, or part thereof, is used or can be
used for a potential therapeutic indication: Imaging agent; Anticancer; Immunostimulant;
Immunoconjugate; Monoclonal antibody, murine; Antisense therapy; antibody.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: integral plasma membrane protein; membrane, which are
annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HUMCEA can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 33 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
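The "parts per million" figure described above is a ratio scaled by one million. The following Python sketch shows that arithmetic only. The weighting scheme applied to the EST counts is described elsewhere in the application and is not reproduced here, so the function `expression_ppm` is an unweighted simplification, and the counts in the example are hypothetical values chosen for illustration, not data from the application.

```python
def expression_ppm(cluster_ests, total_ests):
    """Expression of a cluster in one tissue category, in parts per million.

    cluster_ests: number of ESTs assigned to the cluster in the category.
    total_ests:   number of all ESTs in the same category.
    Unweighted simplification; the application refers to weighted expression.
    """
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    return 1_000_000 * cluster_ests / total_ests


# Hypothetical counts, for illustration only: 3 cluster ESTs among 120,000.
example_ppm = expression_ppm(3, 120_000)
```

A category in which no EST of the cluster was observed gives 0, which is how a zero entry in a normal tissue distribution table can be read.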

Overall, the following results were obtained as shown with regard to the histograms in
Figure 33 and Table 939. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors
from different tissues and pancreas carcinoma.

Table 939 - Normal tissue distribution

Name of Tissue    Number
colon             1175
epithelial        92
general           29
head and neck     81
kidney            0
lung              0
lymph nodes       0
breast            0
pancreas          0
prostate          0
stomach           256



Table 940 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1        P2        SP1       R3     SP2       R4
colon             2.0e-01   2.7e-01   9.8e-01   0.5    1         0.5
epithelial        2.1e-03   2.7e-02   6.4e-04   1.4    2.1e-01   1.0
general           3.9e-08   8.2e-06   9.2e-18   3.2    1.3e-10   2.2
head and neck     3.4e-01   5.0e-01   2.1e-01   1.8    5.6e-01   0.9
kidney            4.3e-01   5.3e-01   5.8e-01   2.1    7.0e-01   1.6
lung              1.3e-01   2.6e-01   1         1.1    1         1.1
lymph nodes       3.1e-01   5.7e-01   8.1e-02   6.0    3.3e-01   2.5
breast            3.8e-01   1.5e-01   1         1.0    6.8e-01   1.5
pancreas          2.2e-02   2.3e-02   1.4e-08   7.8    7.4e-07   6.4
prostate          5.3e-01   6.0e-01   3.0e-01   2.5    4.2e-01   2.0
stomach           1.5e-01   4.7e-01   8.9e-01   0.6    7.2e-01   0.4



For this cluster, at least one oligonucleotide was found to demonstrate overexpression of
the cluster, although not of at least one transcript/segment as listed below. Microarray (chip)
data is also available for this cluster as follows. Various oligonucleotides were tested for being
differentially expressed in various disease conditions, particularly cancer, as previously
described. The following oligonucleotides were found to hit this cluster but not other
segments/transcripts below (in relation to lung cancer), shown in Table 941.

Table 941 - Oligonucleotides related to this cluster

Oligonucleotide name    Overexpressed in cancers    Chip reference
HUMCEA_0_0_15168        lung malignant tumors       LUN

As noted above, cluster HUMCEA features 5 transcript(s), which were listed in Table 935
above. These transcript(s) encode for protein(s) which are variant(s) of protein
Carcinoembryonic antigen-related cell adhesion molecule 5 precursor. A description of each
variant protein according to the present invention is now provided.

Variant protein HUMCEA_PEA_1_P4 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMCEA_PEA_1_T8. An alignment is given to the known protein (Carcinoembryonic
antigen-related cell adhesion molecule 5 precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between HUMCEA_PEA_1_P4 and CEA5_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P4, comprising a
first amino acid sequence being at least 90% homologous to
MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQ
HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT
LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV
NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILNVL
corresponding to amino acids 1 - 234 of CEA5_HUMAN, which also corresponds to amino
acids 1 - 234 of HUMCEA_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
CEYICSSLAQAASPNPQGQRQDFSVPLRFKYTDPQPWTSRLSVTFCPRKTWADQVLTKN
RRGGAASVLGGSGSTPYDGRNR corresponding to amino acids 235 - 315 of
HUMCEA_PEA_1_P4, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMCEA_PEA_1_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
CEYICSSLAQAASPNPQGQRQDFSVPLRFKYTDPQPWTSRLSVTFCPRKTWADQVLTKN
RRGGAASVLGGSGSTPYDGRNR in HUMCEA_PEA_1_P4.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCEA_PEA_1_P4 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 942, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCEA_PEA_1_P4
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 942 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
63                                       F -> L                      No
80                                       I -> V                      Yes
83                                       V -> A                      Yes
137                                      Q -> P                      Yes
173                                      D -> N                      No



The glycosylation sites of variant protein HUMCEA_PEA_1_P4, as compared to the
known protein Carcinoembryonic antigen-related cell adhesion molecule 5 precursor, are
described in Table 943 (given according to their position(s) on the amino acid sequence in the
first column; the second column indicates whether the glycosylation site is present in the variant
protein; and the last column indicates whether the position is different on the variant protein).

Table 943 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
197                                        yes                           197
466                                        no
360                                        no
288                                        no
665                                        no
560                                        no
650                                        no
480                                        no
104                                        yes                           104
580                                        no
204                                        yes                           204
115                                        yes                           115
208                                        yes                           208
152                                        yes                           152
309                                        no
432                                        no
351                                        no
246                                        no
182                                        yes                           182
612                                        no
256                                        no
508                                        no
330                                        no
274                                        no
292                                        no
553                                        no
529                                        no
375                                        no

Variant protein HUMCEA_PEA_1_P4 is encoded by the following transcript(s):
HUMCEA_PEA_1_T8, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCEA_PEA_1_T8 is shown in bold; this coding portion starts
at position 115 and ends at position 1059. The transcript also has the following SNPs as listed in
Table 944 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCEA_PEA_1_P4 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 944 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
49                                    T ->                       No
273                                   A -> C                     Yes
303                                   T -> G                     No
324                                   T -> C                     Yes
352                                   A -> G                     Yes
362                                   T -> C                     Yes
524                                   A -> C                     Yes
631                                   G -> A                     No
1315                                  A -> G                     No
1380                                  T -> C                     No
1533                                  C -> A                     Yes
1706                                  G -> A                     Yes
2308                                  T -> C                     No
2362                                  C -> T                     No
2455                                  A ->                       No
2504                                  C -> A                     Yes
2558                                  G ->                       No
2623                                  G ->                       No
2639                                  T -> A                     No
2640                                  T -> A                     No
2832                                  G -> A                     Yes
2885                                  C -> T                     No
3396                                  A -> G                     Yes
3562                                  C -> T                     Yes
3753                                  C -> T                     Yes
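The nucleotide positions in Table 944 and the amino acid positions in Table 942 are related through the coding portion of transcript HUMCEA_PEA_1_T8 (positions 115 to 1059). The following Python sketch shows that relationship. The function name `codon_index` is an assumption made for this illustration, and the sketch assumes 1-based, inclusive positions with the first codon starting at position 115. It only locates the affected codon; deciding whether a change is silent would require the transcript sequence itself.

```python
CDS_START = 115   # first coding position of HUMCEA_PEA_1_T8 (from the text)
CDS_END = 1059    # last coding position of HUMCEA_PEA_1_T8 (from the text)


def codon_index(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Map a nucleotide position to a 1-based amino acid position.

    Returns None when the position lies outside the coding portion
    (untranslated regions), where no amino acid is affected.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Number of codons in the coding portion; matches the 315 amino acids
# given for HUMCEA_PEA_1_P4 in the comparison report.
protein_length = (CDS_END - CDS_START + 1) // 3

# Nucleotide SNP positions from Table 944 that fall inside the coding portion.
coding_snps = {pos: codon_index(pos) for pos in (273, 303, 324, 352, 362, 524, 631)}
```

Under these assumptions, positions 303, 352, 362, 524 and 631 map to amino acids 63, 80, 83, 137 and 173, the five positions listed in Table 942, while positions 273 and 324 fall in codons that do not appear in that table.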



Variant protein HUMCEA_PEA_1_P5 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMCEA_PEA_1_T9. An alignment is given to the known protein (Carcinoembryonic
antigen-related cell adhesion molecule 5 precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between HUMCEA_PEA_1_P5 and CEA5_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P5, comprising a
first amino acid sequence being at least 90% homologous to
MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQ
HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT
LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV
NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDA
PTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTC
QAHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQNTTYLWWV
NNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELSVDHSDPVILNVLYGPDD
PTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQELFISNITEKNSGLYTCQ
ANNSASGHSRTTVKTITVSAELPKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVN
GQSLPVSPRLQLSNGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTP
IISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFV
SNLATGRNNSIVKSITVS corresponding to amino acids 1 - 675 of CEA5_HUMAN, which
also corresponds to amino acids 1 - 675 of HUMCEA_PEA_1_P5, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
GKWLPGASASYSGVESIWFSPKSQEDIFFPSLCSMGTRKSQILS corresponding to amino
acids 676 - 719 of HUMCEA_PEA_1_P5, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMCEA_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GKWLPGASASYSGVESIWFSPKSQEDIFFPSLCSMGTRKSQILS in
HUMCEA_PEA_1_P5.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMCEA_PEA_1_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 945, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HUMCEA_PEA_1_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 945 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
63                                       F -> L                      No
80                                       I -> V                      Yes
83                                       V -> A                      Yes
137                                      Q -> P                      Yes
173                                      D -> N                      No
289                                      I -> T                      No
340                                      A -> D                      Yes
398                                      E -> K                      Yes
647                                      P ->                        No
664                                      R -> S                      Yes



The glycosylation sites of variant protein HUMCEA_PEA_1_P5, as compared to the
known protein Carcinoembryonic antigen-related cell adhesion molecule 5 precursor, are
described in Table 946 (given according to their position(s) on the amino acid sequence in the
first column; the second column indicates whether the glycosylation site is present in the variant
protein; and the last column indicates whether the position is different on the variant protein).

Table 946 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
197                                        yes                           197
466                                        yes                           466
360                                        yes                           360
288                                        yes                           288
665                                        yes                           665
560                                        yes                           560
650                                        yes                           650
480                                        yes                           480
104                                        yes                           104
580                                        yes                           580
204                                        yes                           204
115                                        yes                           115
208                                        yes                           208
152                                        yes                           152
309                                        yes                           309
432                                        yes                           432
351                                        yes                           351
246                                        yes                           246
182                                        yes                           182
612                                        yes                           612
256                                        yes                           256
508                                        yes                           508
330                                        yes                           330
274                                        yes                           274
292                                        yes                           292
553                                        yes                           553
529                                        yes                           529
375                                        yes                           375



Variant protein HUMCEA_PEA_1_P5 is encoded by the following transcript(s):
HUMCEA_PEA_1_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HUMCEA_PEA_1_T9 is shown in bold; this coding portion starts
at position 115 and ends at position 2271. The transcript also has the following SNPs as listed in
Table 947 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HUMCEA_PEA_1_P5 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 947 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
49                                    T ->                       No
273                                   A -> C                     Yes
303                                   T -> G                     No
324                                   T -> C                     Yes
352                                   A -> G                     Yes
362                                   T -> C                     Yes
524                                   A -> C                     Yes
631                                   G -> A                     No
915                                   A -> G                     No
980                                   T -> C                     No
1133                                  C -> A                     Yes
1306                                  G -> A                     Yes
1908                                  T -> C                     No
1962                                  C -> T                     No
2055                                  A ->                       No
2104                                  C -> A                     Yes
3259                                  T -> C                     Yes



Variant protein HUMCEA_PEA_1_P14 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMCEA_PEA_1_T20. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein HUMCEA_PEA_1_P14 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 948, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMCEA_PEA_1_P14 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 948 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
63                                       F -> L                      No
80                                       I -> V                      Yes
83                                       V -> A                      Yes
137                                      Q -> P                      Yes
173                                      D -> N                      No
289                                      I -> T                      No
340                                      A -> D                      Yes
398                                      E -> K                      Yes



Variant protein HUMCEA_PEA_1_P14 is encoded by the following transcript(s):
HUMCEA_PEA_1_T20, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMCEA_PEA_1_T20 is shown in bold; this coding portion
starts at position 115 and ends at position 1821. The transcript also has the following SNPs as
listed in Table 949 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein HUMCEA_PEA_1_P14 sequence provides support
for the deduced sequence of this variant protein according to the present invention).

Table 949 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
49                                    T ->                       No
273                                   A -> C                     Yes
303                                   T -> G                     No
324                                   T -> C                     Yes
352                                   A -> G                     Yes
362                                   T -> C                     Yes
524                                   A -> C                     Yes
631                                   G -> A                     No
915                                   A -> G                     No
980                                   T -> C                     No
1133                                  C -> A                     Yes
1306                                  G -> A                     Yes



Variant protein HUMCEA_PEA_1_P19 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMCEA_PEA_1_T25. An alignment is given to the known protein (Carcinoembryonic
antigen-related cell adhesion molecule 5 precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:

Comparison report between HUMCEA_PEA_1_P19 and CEA5_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P19, comprising a
first amino acid sequence being at least 90% homologous to
MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQ
HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYT
LHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDKDAVAFTCEPETQDATYLWWV
NNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVILN
corresponding to amino acids 1 - 232 of CEA5_HUMAN, which also corresponds to amino
acids 1 - 232 of HUMCEA_PEA_1_P19, and a second amino acid sequence being at least 90%
homologous to
VLYGPDTPIISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN
GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVALI
corresponding to amino acids 589 - 702 of CEA5_HUMAN, which also corresponds to amino
acids 233 - 346 of HUMCEA_PEA_1_P19, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
HUMCEA_PEA_1_P19, comprising a polypeptide having a length "n", wherein n is at least
about 10 amino acids in length, optionally at least about 20 amino acids in length, preferably at
least about 30 amino acids in length, more preferably at least about 40 amino acids in length and
most preferably at least about 50 amino acids in length, wherein at least two amino acids
comprise NV, having a structure as follows: a sequence starting from any of amino acid
numbers 232-x to 232; and ending at any of amino acid numbers 233 + ((n-2) - x), in which x
varies from 0 to n-2.
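The edge portion definition above describes a family of windows of length n that all span the junction between amino acids 232 and 233 (the residues N and V). The following Python sketch enumerates those windows as coordinate pairs. The function name `edge_windows` and the reading of the definition (start at 232 - x, end at 233 + ((n - 2) - x), for each x from 0 to n - 2, with inclusive 1-based positions) are assumptions made for this illustration.

```python
def edge_windows(n, junction=232):
    """List the (start, end) positions of all length-n edge portions.

    junction is the last amino acid of the first part (232 for
    HUMCEA_PEA_1_P19); junction + 1 is the first amino acid of the
    second part. Positions are 1-based and inclusive.
    """
    if n < 2:
        raise ValueError("an edge portion must cover both junction residues")
    windows = []
    for x in range(n - 1):                 # x varies from 0 to n-2
        start = junction - x
        end = junction + 1 + ((n - 2) - x)
        if start >= 1:                     # stay within the protein
            windows.append((start, end))
    return windows


# All edge portions of the minimum length mentioned in the text (n = 10).
windows_10 = edge_windows(10)
```

For n = 10 this gives nine windows, from positions 232 to 241 (x = 0) through positions 224 to 233 (x = 8); each is ten residues long and contains both position 232 and position 233.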

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
membrane. The protein localization is believed to be membrane because of manual inspection of
known protein localization and/or gene structure.

Variant protein HUMCEA_PEA_1_P19 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 950, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMCEA_PEA_1_P19 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 950 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
63                                       F -> L                      No
80                                       I -> V                      Yes
83                                       V -> A                      Yes
137                                      Q -> P                      Yes
173                                      D -> N                      No
291                                      P ->                        No
308                                      R -> S                      Yes
326                                      G ->                        No



The glycosylation sites of variant protein HUMCEA_PEA_1_P19, as compared to the
known protein Carcinoembryonic antigen-related cell adhesion molecule 5 precursor, are
described in Table 951 (given according to their position(s) on the amino acid sequence in the
first column; the second column indicates whether the glycosylation site is present in the variant
protein; and the last column indicates whether the position is different on the variant protein).

Table 951 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
197                                        yes                           197
466                                        no
360                                        no
288                                        no
665                                        yes                           309
560                                        no
650                                        yes                           294
480                                        no
104                                        yes                           104
580                                        no
204                                        yes                           204
115                                        yes                           115
208                                        yes                           208
152                                        yes                           152
309                                        no
432                                        no
351                                        no
246                                        no
182                                        yes                           182
612                                        yes                           256
256                                        no
508                                        no
330                                        no
274                                        no
292                                        no
553                                        no
529                                        no
375                                        no





Variant protein HUMCEA_PEA_1_P19 is encoded by the following transcript(s):
HUMCEA_PEA_1_T25, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMCEA_PEA_1_T25 is shown in bold; this coding portion
starts at position 115 and ends at position 1152. The transcript also has the following SNPs as
listed in Table 952 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein HUMCEA_PEA_1_P19 sequence provides support
for the deduced sequence of this variant protein according to the present invention).

Table 952 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
49                                    T->                        No
273                                   A->C                       Yes
303                                   T->G                       No
324                                   T->C                       Yes
352                                   A->G                       Yes
362                                   T->C                       Yes
524                                   A->C                       Yes
631                                   G->A                       No
840                                   T->C                       No
894                                   C->T                       No
987                                   A->                        No
1036                                  C->A                       Yes
1090                                  G->                        No
1155                                  G->                        No
1171                                  T->A                       No
1172                                  T->A                       No
1364                                  G->A                       Yes
1417                                  C->T                       No
1928                                  A->G                       Yes
2094                                  C->T                       Yes
2285                                  C->T                       Yes
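
The nucleotide positions in Table 952 can be related to the amino acid positions in Table 950 through the stated coding portion of transcript HUMCEA_PEA_1_T25 (positions 115 to 1152). The following is a minimal sketch, assuming 1-based nucleotide numbering and a reading frame that begins at the first coding position; the function name `codon_number` is hypothetical and is not part of the application.

```python
# Coding portion of HUMCEA_PEA_1_T25 as stated in the text (assumed 1-based, inclusive).
CDS_START = 115
CDS_END = 1152


def codon_number(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Return the 1-based codon (amino acid) number containing nt_pos,
    or None when the position lies outside the coding portion."""
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Nucleotide SNP positions from Table 952 and amino acid positions from Table 950.
TABLE_952_POSITIONS = [49, 273, 303, 324, 352, 362, 524, 631, 840, 894, 987,
                       1036, 1090, 1155, 1171, 1172, 1364, 1417, 1928, 2094, 2285]
TABLE_950_POSITIONS = {63, 80, 83, 137, 173, 291, 308, 326}

# Codons hit by SNPs that fall inside the coding portion.
coding_codons = {codon_number(p) for p in TABLE_952_POSITIONS
                 if codon_number(p) is not None}

if __name__ == "__main__":
    for p in TABLE_952_POSITIONS:
        print(p, codon_number(p))
```

Under these assumptions, every amino acid position in Table 950 is the codon of some coding-region SNP in Table 952. The remaining coding-region SNPs are presumably silent, although the text does not state this explicitly.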



Variant protein HUMCEA_PEA_1_P20 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMCEA_PEA_1_T26. An alignment is given to the known protein (Carcinoembryonic
antigen-related cell adhesion molecule 5 precursor) at the end of the application. One or more
alignments to one or more previously published protein sequences are given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to each such aligned protein is as follows:
Comparison report between HUMCEA_PEA_1_P20 and CEA5_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMCEA_PEA_1_P20, comprising a
first amino acid sequence being at least 90 % homologous to
MESPSAPPHRWCIPWQRLLLTA

HLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNnQNDTGFYT
LHVIKSDLVNEEATGQFRVYP corresponding to amino acids 1 - 142 of CEA5_HUMAN,
which also corresponds to amino acids 1 - 142 of HUMCEA_PEA_1_P20, and a second amino
acid sequence being at least 90 % homologous to
ELPKPSISSNNSKPVEDKDAVAFTCEPEAQOT^
LF]NTVTRNDARAYVCGIQNSVSA
ASNPSPQYSWRINGIPQQHTQVLFIAK^

TSPGLSAGATVGIMIGVLVGVALI corresponding to amino acids 499 - 702 of
CEA5_HUMAN, which also corresponds to amino acids 143 - 346 of HUMCEA_PEA_1_P20,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
HUMCEA_PEA_1_P20, comprising a polypeptide having a length "n", wherein n is at least
about 10 amino acids in length, optionally at least about 20 amino acids in length, preferably at
least about 30 amino acids in length, more preferably at least about 40 amino acids in length and
most preferably at least about 50 amino acids in length, wherein at least two amino acids
comprise PE, having a structure as follows: a sequence starting from any of amino acid numbers
142-x to 142; and ending at any of amino acid numbers 143+((n-2) - x), in which x varies from
0 to n-2.
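
The edge-portion definition above can be read as a sliding window of length n that always contains both junction residues (positions 142 and 143 of HUMCEA_PEA_1_P20). The following is a minimal sketch of that reading, assuming 1-based inclusive residue numbering; the function name `edge_portion_windows` is hypothetical and is not part of the application.

```python
def edge_portion_windows(n, left=142, right=143):
    """List the (start, end) residue ranges of length n that span the junction.

    left and right are the last residue of the first block and the first
    residue of the second block (142 and 143 for HUMCEA_PEA_1_P20, per the
    text). Numbering is assumed to be 1-based and inclusive.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to cover both junction residues")
    windows = []
    for x in range(n - 1):              # x varies from 0 to n-2
        start = left - x                # 142 - x
        end = right + (n - 2) - x       # 143 + ((n-2) - x)
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    for start, end in edge_portion_windows(10):
        print(start, end)
```

Each window has exactly n residues, and the n-1 windows differ only in how far they extend to either side of the junction.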

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
membrane. The protein localization is believed to be membrane because of manual inspection of
known protein localization and/or gene structure.

Variant protein HUMCEA_PEA_1_P20 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 953 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMCEA_PEA_1_P20 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 953 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
63                                       F->L                        No
80                                       I->V                        Yes
83                                       V->A                        Yes
137                                      Q->P                        Yes
291                                      P->                         No
308                                      R->S                        Yes
326                                      G->                         No



The glycosylation sites of variant protein HUMCEA_PEA_1_P20, as compared to the
known protein Carcinoembryonic antigen-related cell adhesion molecule 5 precursor, are
described in Table 954 (given according to their position(s) on the amino acid sequence in the
first column; the second column indicates whether the glycosylation site is present in the variant
protein; and the last column indicates whether the position is different on the variant protein).
Table 954 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
197                                        no
466                                        no
360                                        no
288                                        no
665                                        yes                           309
560                                        yes                           204
650                                        yes                           294
480                                        no
104                                        yes                           104
580                                        yes                           224
204                                        no
115                                        yes                           115
208                                        no
152                                        no
309                                        no
432                                        no
351                                        no
246                                        no
182                                        no
612                                        yes                           256
256                                        no
508                                        yes                           152
330                                        no
274                                        no
292                                        no
553                                        yes                           197
529                                        yes                           173
375                                        no
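
The positions in the last column of Table 954 follow from the comparison report: HUMCEA_PEA_1_P20 retains amino acids 1 - 142 of CEA5_HUMAN unchanged and amino acids 499 - 702 as positions 143 - 346. The following is a minimal sketch of that coordinate mapping, assuming 1-based numbering; the names `P20_BLOCKS` and `map_known_to_variant` are hypothetical. It maps coordinates only and does not check whether the glycosylation motif itself is preserved.

```python
# Blocks of the known protein (CEA5_HUMAN) retained in HUMCEA_PEA_1_P20,
# as (known_start, known_end, variant_start), taken from the comparison report.
P20_BLOCKS = [(1, 142, 1), (499, 702, 143)]


def map_known_to_variant(known_pos, blocks=P20_BLOCKS):
    """Return the variant-protein position for a known-protein position,
    or None when the residue is not retained in the variant."""
    for known_start, known_end, variant_start in blocks:
        if known_start <= known_pos <= known_end:
            return variant_start + (known_pos - known_start)
    return None


if __name__ == "__main__":
    for site in (104, 197, 560, 665):
        print(site, map_known_to_variant(site))
```

Sites that map to None correspond to the rows marked "no" in the table, since those residues lie in the portion of the known protein that is absent from the variant.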





Variant protein HUMCEA_PEA_1_P20 is encoded by the following transcript(s):
HUMCEA_PEA_1_T26, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMCEA_PEA_1_T26 is shown in bold; this coding portion
starts at position 115 and ends at position 1152. The transcript also has the following SNPs as
listed in Table 955 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein HUMCEA_PEA_1_P20 sequence provides support
for the deduced sequence of this variant protein according to the present invention).

Table 955 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
49                                    T->                        No
273                                   A->C                       Yes
303                                   T->G                       No
324                                   T->C                       Yes
352                                   A->G                       Yes
362                                   T->C                       Yes
524                                   A->C                       Yes
840                                   T->C                       No
894                                   C->T                       No
987                                   A->                        No
1036                                  C->A                       Yes
1090                                  G->                        No
1155                                  G->                        No
1171                                  T->A                       No
1172                                  T->A                       No
1364                                  G->A                       Yes
1417                                  C->T                       No
1928                                  A->G                       Yes
2094                                  C->T                       Yes
2285                                  C->T                       Yes



As noted above, cluster HUMCEA features 42 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HUMCEA_PEA_1_node_0 according to the present invention is
supported by 56 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20, HUMCEA_PEA_1_T25 and
HUMCEA_PEA_1_T26. Table 956 below describes the starting and ending position of this
segment on each transcript.

Table 956 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1                           178
HUMCEA_PEA_1_T9      1                           178
HUMCEA_PEA_1_T20     1                           178
HUMCEA_PEA_1_T25     1                           178
HUMCEA_PEA_1_T26     1                           178



Segment cluster HUMCEA_PEA_1_node_2 according to the present invention is
supported by 83 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20, HUMCEA_PEA_1_T25 and
HUMCEA_PEA_1_T26. Table 957 below describes the starting and ending position of this
segment on each transcript.

Table 957 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      179                         456
HUMCEA_PEA_1_T9      179                         456
HUMCEA_PEA_1_T20     179                         456
HUMCEA_PEA_1_T25     179                         456
HUMCEA_PEA_1_T26     179                         456

Segment cluster HUMCEA_PEA_1_node_11 according to the present invention is
supported by 6 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8. Table 958 below
describes the starting and ending position of this segment on each transcript.

Table 958 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      818                         1217



Microarray (chip) data is also available for this segment as follows. As described above 
with regard to the cluster itself, various oligonucleotides were tested for being differentially 
expressed in various disease conditions, particularly cancer. The following oligonucleotides 
were found to hit this segment (in relation to lung cancer), shown in Table 959.

Table 959 - Oligonucleotides related to this segment

Oligonucleotide name   Overexpressed in cancers   Chip reference
HUMCEA_0_0_96          lung malignant tumors      LUN



Segment cluster HUMCEA_PEA_1_node_12 according to the present invention is
supported by 83 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 960 below describes the starting and
ending position of this segment on each transcript.

Table 960 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1218                        1472
HUMCEA_PEA_1_T9      818                         1072
HUMCEA_PEA_1_T20     818                         1072

Segment cluster HUMCEA_PEA_1_node_31 according to the present invention is
supported by 87 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 961 below describes the starting and
ending position of this segment on each transcript.

Table 961 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1817                        2006
HUMCEA_PEA_1_T9      1417                        1606
HUMCEA_PEA_1_T20     1417                        1606



Segment cluster HUMCEA_PEA_1_node_36 according to the present invention is
supported by 94 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T26. Table 962 below describes the starting and
ending position of this segment on each transcript.



Table 962 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2159                        2285
HUMCEA_PEA_1_T9      1759                        1885
HUMCEA_PEA_1_T26     691                         817



Segment cluster HUMCEA_PEA_1_node_44 according to the present invention is
supported by 112 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 963 below
describes the starting and ending position of this segment on each transcript.

Table 963 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2286                        2540
HUMCEA_PEA_1_T9      1886                        2140
HUMCEA_PEA_1_T25     818                         1072
HUMCEA_PEA_1_T26     818                         1072



Segment cluster HUMCEA_PEA_1_node_46 according to the present invention is
supported by 15 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T9. Table 964 below
describes the starting and ending position of this segment on each transcript.

Table 964 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T9      2174                        3347



Segment cluster HUMCEA_PEA_1_node_63 according to the present invention is
supported by 68 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 965 below describes the starting
and ending position of this segment on each transcript.

Table 965 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2957                        3135
HUMCEA_PEA_1_T25     1489                        1667
HUMCEA_PEA_1_T26     1489                        1667

Segment cluster HUMCEA_PEA_1_node_65 according to the present invention is
supported by 54 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 966 below describes the starting
and ending position of this segment on each transcript.

Table 966 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      3166                        3897
HUMCEA_PEA_1_T25     1698                        2429
HUMCEA_PEA_1_T26     1698                        2429

Segment cluster HUMCEA_PEA_1_node_67 according to the present invention is
supported by 2 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T20. Table 967 below
describes the starting and ending position of this segment on each transcript.

Table 967 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T20     1607                        1886



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster HUMCEA_PEA_1_node_3 according to the present invention is
supported by 67 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20, HUMCEA_PEA_1_T25 and
HUMCEA_PEA_1_T26. Table 968 below describes the starting and ending position of this
segment on each transcript.

Table 968 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      457                         538
HUMCEA_PEA_1_T9      457                         538
HUMCEA_PEA_1_T20     457                         538
HUMCEA_PEA_1_T25     457                         538
HUMCEA_PEA_1_T26     457                         538
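
The distinction between the longer segments and the short segments (up to about 120 bp) can be checked directly from the starting and ending positions in these tables. The following is a minimal sketch, assuming the positions are 1-based and inclusive and treating "about 120 bp" as a hard cutoff of 120; the names `segment_length` and `is_short_segment` are hypothetical and are not part of the application.

```python
SHORT_SEGMENT_MAX_BP = 120  # assumed hard cutoff for "up to about 120 bp"


def segment_length(start, end):
    """Length in bp of a segment given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def is_short_segment(start, end, limit=SHORT_SEGMENT_MAX_BP):
    """True when the segment is no longer than the short-segment limit."""
    return segment_length(start, end) <= limit


if __name__ == "__main__":
    # Coordinates on HUMCEA_PEA_1_T8 from Tables 956 (node_0) and 968 (node_3).
    for name, (start, end) in {"node_0": (1, 178), "node_3": (457, 538)}.items():
        print(name, segment_length(start, end), is_short_segment(start, end))
```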



Segment cluster HUMCEA_PEA_1_node_7 according to the present invention is
supported by 73 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20 and HUMCEA_PEA_1_T25. Table 969 below
describes the starting and ending position of this segment on each transcript.

Table 969 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      539                         642
HUMCEA_PEA_1_T9      539                         642
HUMCEA_PEA_1_T20     539                         642
HUMCEA_PEA_1_T25     539                         642

Segment cluster HUMCEA_PEA_1_node_8 according to the present invention is
supported by 67 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20 and HUMCEA_PEA_1_T25. Table 970 below
describes the starting and ending position of this segment on each transcript.

Table 970 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      643                         690
HUMCEA_PEA_1_T9      643                         690
HUMCEA_PEA_1_T20     643                         690
HUMCEA_PEA_1_T25     643                         690



Segment cluster HUMCEA_PEA_1_node_9 according to the present invention is
supported by 71 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20 and HUMCEA_PEA_1_T25. Table 971 below
describes the starting and ending position of this segment on each transcript.

Table 971 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      691                         738
HUMCEA_PEA_1_T9      691                         738
HUMCEA_PEA_1_T20     691                         738
HUMCEA_PEA_1_T25     691                         738



Segment cluster HUMCEA_PEA_1_node_10 according to the present invention is
supported by 67 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9, HUMCEA_PEA_1_T20 and HUMCEA_PEA_1_T25. Table 972 below
describes the starting and ending position of this segment on each transcript.

Table 972 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      739                         817
HUMCEA_PEA_1_T9      739                         817
HUMCEA_PEA_1_T20     739                         817
HUMCEA_PEA_1_T25     739                         817
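
The segment coordinates reported so far for transcript HUMCEA_PEA_1_T8 (Tables 956, 957 and 968 to 972) appear to tile the transcript end to end. The following is a minimal sketch of a consistency check on such coordinates, assuming 1-based inclusive positions; the names `T8_SEGMENTS` and `find_gaps_or_overlaps` are hypothetical and are not part of the application.

```python
# Start and end positions on HUMCEA_PEA_1_T8, copied from
# Tables 956, 957, 968, 969, 970, 971 and 972.
T8_SEGMENTS = {
    "node_0": (1, 178),
    "node_2": (179, 456),
    "node_3": (457, 538),
    "node_7": (539, 642),
    "node_8": (643, 690),
    "node_9": (691, 738),
    "node_10": (739, 817),
}


def find_gaps_or_overlaps(segments):
    """Return (previous, next, previous_end, next_start) for every pair of
    neighbouring segments that is not exactly adjacent."""
    ordered = sorted(segments.items(), key=lambda item: item[1][0])
    problems = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b != end_a + 1:
            problems.append((name_a, name_b, end_a, start_b))
    return problems


if __name__ == "__main__":
    print(find_gaps_or_overlaps(T8_SEGMENTS))
```

An empty result means each segment starts one position after the previous one ends, which is a quick way to catch transcription errors in tables of this kind.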






Segment cluster HUMCEA_PEA_1_node_15 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 973 below describes the starting and ending position of this
segment on each transcript.

Table 973 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1473                        1475
HUMCEA_PEA_1_T9      1073                        1075
HUMCEA_PEA_1_T20     1073                        1075



Segment cluster HUMCEA_PEA_1_node_16 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 974 below describes the starting and ending position of this
segment on each transcript.

Table 974 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1476                        1481
HUMCEA_PEA_1_T9      1076                        1081
HUMCEA_PEA_1_T20     1076                        1081



Segment cluster HUMCEA_PEA_1_node_17 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 975 below describes the starting and ending position of this
segment on each transcript.

Table 975 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1482                        1488
HUMCEA_PEA_1_T9      1082                        1088
HUMCEA_PEA_1_T20     1082                        1088



Segment cluster HUMCEA_PEA_1_node_18 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 976 below describes the starting and ending position of this
segment on each transcript.

Table 976 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1489                        1506
HUMCEA_PEA_1_T9      1089                        1106
HUMCEA_PEA_1_T20     1089                        1106

Segment cluster HUMCEA_PEA_1_node_19 according to the present invention is
supported by 69 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 977 below describes the starting and
ending position of this segment on each transcript.

Table 977 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1507                        1576
HUMCEA_PEA_1_T9      1107                        1176
HUMCEA_PEA_1_T20     1107                        1176

Segment cluster HUMCEA_PEA_1_node_20 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 978 below describes the starting and ending position of this
segment on each transcript.

Table 978 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1577                        1600
HUMCEA_PEA_1_T9      1177                        1200
HUMCEA_PEA_1_T20     1177                        1200



Segment cluster HUMCEA_PEA_1_node_21 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 979 below describes the starting and ending position of this
segment on each transcript.

Table 979 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1601                        1624
HUMCEA_PEA_1_T9      1201                        1224
HUMCEA_PEA_1_T20     1201                        1224



Segment cluster HUMCEA_PEA_1_node_22 according to the present invention is
supported by 77 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 980 below describes the starting and
ending position of this segment on each transcript.



Table 980 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1625                        1702
HUMCEA_PEA_1_T9      1225                        1302
HUMCEA_PEA_1_T20     1225                        1302



Segment cluster HUMCEA_PEA_1_node_23 according to the present invention is
supported by 72 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 981 below describes the starting and
ending position of this segment on each transcript.
Table 981 - Segment location on transcripts 






Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1703                        1732
HUMCEA_PEA_1_T9      1303                        1332
HUMCEA_PEA_1_T20     1303                        1332



Segment cluster HUMCEA_PEA_1_node_24 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 982 below describes the starting and ending position of this
segment on each transcript.

Table 982 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1733                        1751
HUMCEA_PEA_1_T9      1333                        1351
HUMCEA_PEA_1_T20     1333                        1351



Segment cluster HUMCEA_PEA_1_node_27 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 983 below describes the starting and ending position of this
segment on each transcript.

Table 983 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1752                        1770
HUMCEA_PEA_1_T9      1352                        1370
HUMCEA_PEA_1_T20     1352                        1370

Segment cluster HUMCEA_PEA_1_node_29 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T20. Table 984 below describes the starting and ending position of this
segment on each transcript.

Table 984 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1771                        1788
HUMCEA_PEA_1_T9      1371                        1388
HUMCEA_PEA_1_T20     1371                        1388



Segment cluster HUMCEA_PEA_1_node_30 according to the present invention is
supported by 67 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T20. Table 985 below describes the starting and
ending position of this segment on each transcript.

Table 985 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      1789                        1816
HUMCEA_PEA_1_T9      1389                        1416
HUMCEA_PEA_1_T20     1389                        1416



Segment cluster HUMCEA_PEA_1_node_33 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T9 and
HUMCEA_PEA_1_T26. Table 986 below describes the starting and ending position of this
segment on each transcript.

Table 986 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2007                        2028
HUMCEA_PEA_1_T9      1607                        1628
HUMCEA_PEA_1_T26     539                         560



Segment cluster HUMCEA_PEA_1_node_34 according to the present invention is
supported by 80 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T26. Table 987 below describes the starting and
ending position of this segment on each transcript.



Table 987 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2029                        2110
HUMCEA_PEA_1_T9      1629                        1710
HUMCEA_PEA_1_T26     561                         642



Segment cluster HUMCEA_PEA_1_node_35 according to the present invention is
supported by 75 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T9 and HUMCEA_PEA_1_T26. Table 988 below describes the starting and
ending position of this segment on each transcript.

Table 988 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2111                        2158
HUMCEA_PEA_1_T9      1711                        1758
HUMCEA_PEA_1_T26     643                         690



Segment cluster HUMCEA_PEA_1_node_45 according to the present invention is
supported by 9 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T9. Table 989 below
describes the starting and ending position of this segment on each transcript.

Table 989 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T9      2141                        2173



Segment cluster HUMCEA_PEA_1_node_50 according to the present invention is
supported by 64 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 990 below describes the starting
and ending position of this segment on each transcript.

Table 990 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2541                        2567
HUMCEA_PEA_1_T25     1073                        1099
HUMCEA_PEA_1_T26     1073                        1099

Segment cluster HUMCEA_PEA_1_node_51 according to the present invention is
supported by 88 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 991 below describes the starting
and ending position of this segment on each transcript.

Table 991 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HUMCEA_PEA_1_T8      2568                        2659
HUMCEA_PEA_1_T25     1100                        1191
HUMCEA_PEA_1_T26     1100                        1191



Segment cluster HUMCEA_PEA_1_node_56 according to the present invention is
supported by 75 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 992 below describes the starting
and ending position of this segment on each transcript.

Table 992 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2660                         2685
HUMCEA_PEA_1_T25     1192                         1217
HUMCEA_PEA_1_T26     1192                         1217
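
The segment tables can be cross-checked arithmetically. The following is a minimal sketch, assuming that the listed positions are 1-based and inclusive (the text does not state this explicitly); the names `segments_t8`, `segment_length` and `are_contiguous` are illustrative only. It uses the HUMCEA_PEA_1_T8 coordinates from Tables 990 to 992.

```python
# Coordinates of three consecutive segments on transcript HUMCEA_PEA_1_T8,
# copied from Tables 990, 991 and 992. Positions are ASSUMED to be
# 1-based and inclusive; this convention is not stated in the text.
segments_t8 = [
    ("HUMCEA_PEA_1_node_50", 2541, 2567),
    ("HUMCEA_PEA_1_node_51", 2568, 2659),
    ("HUMCEA_PEA_1_node_56", 2660, 2685),
]


def segment_length(start, end):
    """Length in nucleotides of a segment with inclusive start and end."""
    return end - start + 1


def are_contiguous(segments):
    """True if every segment starts one position after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(segments, segments[1:]))


lengths = {name: segment_length(start, end) for name, start, end in segments_t8}
contiguous = are_contiguous(segments_t8)
```

Under that assumption the same three segments have identical lengths on HUMCEA_PEA_1_T25 (1073-1099, 1100-1191, 1192-1217), which is what one would expect if the segment is the same stretch of sequence on each transcript.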



Segment cluster HUMCEA_PEA_1_node_57 according to the present invention is
supported by 82 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 993 below describes the starting
and ending position of this segment on each transcript.

Table 993 - Segment location on transcripts 



Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2686                         2786
HUMCEA_PEA_1_T25     1218                         1318
HUMCEA_PEA_1_T26     1218                         1318



Segment cluster HUMCEA_PEA_1_node_58 according to the present invention is
supported by 63 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 994 below describes the starting
and ending position of this segment on each transcript.



Table 994 - Segment location on transcripts 



Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2787                         2820
HUMCEA_PEA_1_T25     1319                         1352
HUMCEA_PEA_1_T26     1319                         1352



Segment cluster HUMCEA_PEA_1_node_60 according to the present invention is
supported by 55 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 995 below describes the starting
and ending position of this segment on each transcript.

Table 995 - Segment location on transcripts 

Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2821                         2864
HUMCEA_PEA_1_T25     1353                         1396
HUMCEA_PEA_1_T26     1353                         1396



Segment cluster HUMCEA_PEA_1_node_61 according to the present invention can be
found in the following transcript(s): HUMCEA_PEA_1_T8, HUMCEA_PEA_1_T25 and
HUMCEA_PEA_1_T26. Table 996 below describes the starting and ending position of this
segment on each transcript.

Table 996 - Segment location on transcripts 



Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2865                         2868
HUMCEA_PEA_1_T25     1397                         1400
HUMCEA_PEA_1_T26     1397                         1400



Segment cluster HUMCEA_PEA_1_node_62 according to the present invention is
supported by 60 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 997 below describes the starting
and ending position of this segment on each transcript.

Table 997 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      2869                         2956
HUMCEA_PEA_1_T25     1401                         1488
HUMCEA_PEA_1_T26     1401                         1488









Segment cluster HUMCEA_PEA_1_node_64 according to the present invention is
supported by 45 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMCEA_PEA_1_T8,
HUMCEA_PEA_1_T25 and HUMCEA_PEA_1_T26. Table 998 below describes the starting
and ending position of this segment on each transcript.

Table 998 - Segment location on transcripts 



Transcript name      Segment starting position    Segment ending position
HUMCEA_PEA_1_T8      3136                         3165
HUMCEA_PEA_1_T25     1668                         1697
HUMCEA_PEA_1_T26     1668                         1697



Variant protein alignment to the previously known protein:
Sequence name: CEA5_HUMAN

Sequence documentation : 

Alignment of: HUMCEA_PEA_1_P4 x CEA5_HUMAN 

Alignment segment 1/1:

Quality: 2320.00
Escore: 0
Matching length: 234
Total length: 234
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0



Alignment : 

  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50

 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100

101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150

151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200

201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVL 234
    ||||||||||||||||||||||||||||||||||
201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVL 234

Sequence name: CEA5_HUMAN
Sequence documentation:

Alignment of: HUMCEA_PEA_1_P5 x CEA5_HUMAN

Alignment segment 1/1:

Quality: 6692.00
Escore: 0
Matching length: 675
Total length: 675
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment : 

  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50

 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100

101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150

151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200

201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDAPTISPLNTSYR 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDAPTISPLNTSYR 250

251 SGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQ 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 SGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQ 300

301 AHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQ 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 AHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQ 350

351 NTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELS 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 NTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELS 400

401 VDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWL 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 VDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWL 450

451 IDGNIQQHTQELFISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAEL 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
451 IDGNIQQHTQELFISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAEL 500

501 PKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLS 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 PKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLS 550

551 NGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISP 600
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 NGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISP 600

601 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 650
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 650

651 GTYACFVSNLATGRNNSIVKSITVS 675
    |||||||||||||||||||||||||
651 GTYACFVSNLATGRNNSIVKSITVS 675






Sequence name: CEA5_HUMAN 
Sequence documentation : 

Alignment of: HUMCEA_PEA_1_P19 x CEA5_HUMAN

Alignment segment 1/1:

Quality: 3298.00
Escore: 0
Matching length: 346
Total length: 702
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 49.29
Total Percent Identity: 49.29
Gaps: 1
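
The "Total Percent" figures reported for this alignment are consistent with the matching length expressed as a percentage of the total length. The following is a minimal sketch of that arithmetic, offered as an assumption about how the figure may have been derived rather than as the alignment program's documented method; the function name `percent_of_total` is illustrative only.

```python
def percent_of_total(matching_length, total_length):
    """Matching length as a percentage of total length, to two decimals.

    ASSUMPTION: the reported "Total Percent Identity" equals
    100 * matching length / total length. The text does not define it.
    """
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    return round(100.0 * matching_length / total_length, 2)


# Values reported for HUMCEA_PEA_1_P19 x CEA5_HUMAN.
total_percent_identity = percent_of_total(346, 702)
```

With the reported matching length of 346 and total length of 702 this gives 49.29, in agreement with the listed value.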



Alignment : 

  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50

 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100

101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150

151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200

201 TLFNVTRNDTASYKCETQNPVSARRSDSVILN.................. 232
    ||||||||||||||||||||||||||||||||
201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDAPTISPLNTSYR 250

232 .................................................. 232

251 SGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQ 300

232 .................................................. 232

301 AHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQ 350

232 .................................................. 232

351 NTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELS 400

232 .................................................. 232

401 VDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWL 450

232 .................................................. 232

451 IDGNIQQHTQELFISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAEL 500

232 .................................................. 232

501 PKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLS 550

233 ......................................VLYGPDTPIISP 244
                                          ||||||||||||
551 NGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISP 600

245 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 294
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 650

295 GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVA 344
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVA 700

345 LI 346
    ||
701 LI 702



Sequence name: CEA5_HUMAN 
Sequence documentation : 

Alignment of: HUMCEA_PEA_1_P20 x CEA5_HUMAN

Alignment segment 1/1: 

Quality: 3294.00
Escore: 0
Matching length: 346
Total length: 702
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 49.29
Total Percent Identity: 49.29
Gaps: 1

Alignment : 

  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKE 50

 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 VLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREI 100

101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYP........ 142
    ||||||||||||||||||||||||||||||||||||||||||
101 IYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSIS 150

142 .................................................. 142

151 SNNSKPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTL 200

142 .................................................. 142

201 TLFNVTRNDTASYKCETQNPVSARRSDSVILNVLYGPDAPTISPLNTSYR 250

142 .................................................. 142

251 SGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQ 300

142 .................................................. 142

301 AHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQ 350

142 .................................................. 142

351 NTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYECGIQNELS 400

142 .................................................. 142

401 VDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWL 450

143 ................................................EL 144
                                                    ||
451 IDGNIQQHTQELFISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAEL 500

145 PKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLS 194
    ||||||||||||||||||||||||||||||||||||||||||||||||||
501 PKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLS 550

195 NGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISP 244
    ||||||||||||||||||||||||||||||||||||||||||||||||||
551 NGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISP 600

245 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 294
    ||||||||||||||||||||||||||||||||||||||||||||||||||
601 PDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNN 650

295 GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVA 344
    ||||||||||||||||||||||||||||||||||||||||||||||||||
651 GTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVA 700

345 LI 346
    ||
701 LI 702

DESCRIPTION FOR CLUSTER R35137 
Cluster R35137 features 6 transcript(s) and 20 segment(s) of interest, the names for which
are given in Tables 999 and 1000, respectively; the sequences themselves are given at the end of
the application. The selected protein variants are given in Table 1001.

Table 999 - Transcripts of interest 



Transcript Name                  Sequence ID No.
R35137_PEA_1_PEA_1_PEA_1_T3      114
R35137_PEA_1_PEA_1_PEA_1_T5      115
R35137_PEA_1_PEA_1_PEA_1_T10     116
R35137_PEA_1_PEA_1_PEA_1_T11     117
R35137_PEA_1_PEA_1_PEA_1_T12     118
R35137_PEA_1_PEA_1_PEA_1_T14     119


Table 1000 - Segments of interest 


Segment Name                         Sequence ID No.
R35137_PEA_1_PEA_1_PEA_1_node_2      856
R35137_PEA_1_PEA_1_PEA_1_node_3      857
R35137_PEA_1_PEA_1_PEA_1_node_9      858
R35137_PEA_1_PEA_1_PEA_1_node_11     859
R35137_PEA_1_PEA_1_PEA_1_node_16     860
R35137_PEA_1_PEA_1_PEA_1_node_18     861
R35137_PEA_1_PEA_1_PEA_1_node_20     862
R35137_PEA_1_PEA_1_PEA_1_node_27     863
R35137_PEA_1_PEA_1_PEA_1_node_5      864
R35137_PEA_1_PEA_1_PEA_1_node_7      865
R35137_PEA_1_PEA_1_PEA_1_node_12     866
R35137_PEA_1_PEA_1_PEA_1_node_14     867
R35137_PEA_1_PEA_1_PEA_1_node_15     868
R35137_PEA_1_PEA_1_PEA_1_node_17     869
R35137_PEA_1_PEA_1_PEA_1_node_21     870
R35137_PEA_1_PEA_1_PEA_1_node_22     871
R35137_PEA_1_PEA_1_PEA_1_node_23     872
R35137_PEA_1_PEA_1_PEA_1_node_24     873
R35137_PEA_1_PEA_1_PEA_1_node_25     874
R35137_PEA_1_PEA_1_PEA_1_node_26     875



Table 1001 - Proteins of interest 



Protein Name                     Sequence ID No.    Corresponding Transcript(s)
R35137_PEA_1_PEA_1_PEA_1_P9      1385               R35137_PEA_1_PEA_1_PEA_1_T10; R35137_PEA_1_PEA_1_PEA_1_T12
R35137_PEA_1_PEA_1_PEA_1_P8      1386               R35137_PEA_1_PEA_1_PEA_1_T11
R35137_PEA_1_PEA_1_PEA_1_P11     1387               R35137_PEA_1_PEA_1_PEA_1_T14
R35137_PEA_1_PEA_1_PEA_1_P2      1388               R35137_PEA_1_PEA_1_PEA_1_T3
R35137_PEA_1_PEA_1_PEA_1_P4      1389               R35137_PEA_1_PEA_1_PEA_1_T5



These sequences are variants of the known protein Alanine aminotransferase (SwissProt
accession identifier ALAT_HUMAN; known also according to the synonyms EC 2.6.1.2;
Glutamic--pyruvic transaminase; GPT; Glutamic--alanine transaminase), SEQ ID NO: 1452,
referred to herein as the previously known protein.

Protein Alanine aminotransferase is known or believed to have the following function(s):
Participates in cellular nitrogen metabolism and also in liver gluconeogenesis starting with
precursors transported from skeletal muscles. The sequence for protein Alanine
aminotransferase is given at the end of the application, as "Alanine aminotransferase amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 1002.

Table 1002 - Amino acid mutations for Known Protein 

SNP position(s) on amino acid sequence    Comment
13                                        H -> N (in allele GPT*2; dbSNP: 1063739). /FTId=VAR_000561.
3 - 6                                     STGD -> RRGN
38                                        G -> S
221                                       A -> H



Protein Alanine aminotransferase localization is believed to be Cytoplasmic. 

Cluster R35137 can be used as a diagnostic marker according to overexpression of 
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given 
according to the previously described methods. The term "number" in the right hand column of 
the table and the numbers on the y-axis of figure 34 refer to weighted expression of ESTs in 
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to 
the expression of all ESTs in that category, according to parts per million). 
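
The "parts per million" figure described above is a ratio scaled by one million. The following is a minimal sketch of that scaling only; the EST counts used are hypothetical and are not taken from the application, and any weighting applied by the previously described methods is not modeled here. The function name `parts_per_million` is illustrative.

```python
def parts_per_million(cluster_ests, category_ests):
    """Expression of a cluster within a tissue category, in parts per million.

    cluster_ests:  ESTs assigned to the cluster in the category.
    category_ests: all ESTs in the same category.
    Both inputs here are HYPOTHETICAL counts used only for illustration.
    """
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical example: 3 cluster ESTs out of 150,000 ESTs in a category.
example_ppm = parts_per_million(3, 150_000)
```

Expressing the ratio per million makes values comparable between tissue categories whose libraries differ in total EST count.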

Overall, the following results were obtained as shown with regard to the histograms in 
Figure 34 and Table 1003. This cluster is overexpressed (at least at a minimum level) in the 
following pathological conditions: hepatocellular carcinoma. 



Table 1003 - Normal tissue distribution 



Name of Tissue    Number
brain             12
epithelial        16
general           8
kidney            20
liver             0
lung              0
pancreas          2
prostate



Table 1004 - P values and ratios for expression in cancerous tissue 



Name of Tissue    P1         P2         SP1        R3     SP2        R4
brain             3.2e-01    4.8e-01    1.8e-01    2.5    4.2e-01    1.5
epithelial        7.6e-01    7.7e-01    8.9e-01    0.5    9.8e-01    0.4
general           6.7e-01    8.2e-01    4.2e-01    1.0    8.5e-01    0.7
kidney            8.6e-01    9.0e-01    5.8e-01    0.9    7.0e-01    0.8
liver             1.8e-01    4.5e-01    3.0e-03    7.6    1.6e-01    2.3
lung              1          6.3e-01    1          1.0    6.2e-01    1.6
pancreas          2.3e-01    4.0e-01    1.8e-01    3.1    2.8e-01    2.3
prostate          1          7.8e-01    1          1.0    7.5e-01    1.3



above. These transcript(s) encode for protein(s) which are variant(s) of protein Alanine 
aminotransferase. A description of each variant protein according to the present invention is 
now provided. 



Variant protein R35137_PEA_1_PEA_1_PEA_1_P9 according to the present invention
has an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
R35137_PEA_1_PEA_1_PEA_1_T10. An alignment is given to the known protein (Alanine
aminotransferase) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R35137_PEA_1_PEA_1_PEA_1_P9 and
ALAT_HUMAN_V1 (SEQ ID NO: 1453):

1. An isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P9,
comprising a first amino acid sequence being at least 90% homologous to
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEV corresponding to amino acids 1 -
274 of ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 274 of
R35137_PEA_1_PEA_1_PEA_1_P9, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP
LHLQGLHGRVRAYEAGGGSRAMARPSSPDGPPPPPHLTWPCAGAGSAAAMWRW
corresponding to amino acids 275 - 385 of R35137_PEA_1_PEA_1_PEA_1_P9, wherein said
first amino acid sequence and second amino acid sequence are contiguous and in a sequential
order.

2. An isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P9,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence
RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP
LHLQGLHGRVRAYEAGGGSRAMARPSSPDGPPPPPHLTWPCAGAGSAAAMWRW in
R35137_PEA_1_PEA_1_PEA_1_P9.

It should be noted that the known protein sequence (ALAT_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for ALAT_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 1005 - Changes to ALAT_HUMAN_V1



SNP position(s) on amino acid sequence    Type of change
1                                         init_met
222                                       conflict

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein R35137_PEA_1_PEA_1_PEA_1_P9 is encoded by the following
transcript(s): R35137_PEA_1_PEA_1_PEA_1_T10, for which the sequence(s) is/are given at
the end of the application. The coding portion of transcript
R35137_PEA_1_PEA_1_PEA_1_T10 is shown in bold; this coding portion starts at position
271 and ends at position 1425. The transcript also has the following SNPs as listed in Table
1006 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein R35137_PEA_1_PEA_1_PEA_1_P9 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 1006 - Nucleic acid SNPs



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
230                                    C -> T                      No
231                                    C -> T                      No
310                                    C -> A                      Yes
432                                    G ->                        No
969                                    C ->                        No
1225                                   G ->                        No
1745                                   T -> G                      No
1957                                   C ->                        No
2018                                   G -> A                      No
2019                                   C -> A                      No
2101                                   A -> G                      No
2102                                   A -> G                      No
2159                                   C -> T                      Yes
2710                                   G -> C                      No
2789                                   C -> A                      Yes
3622                                   G -> A                      Yes
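
The stated coding portion of transcript R35137_PEA_1_PEA_1_PEA_1_T10 (positions 271 to 1425) can be checked against the stated length of variant protein R35137_PEA_1_PEA_1_PEA_1_P9 (amino acids 1 - 385). The following is a minimal sketch, assuming 1-based inclusive positions and assuming that the stated coding portion does not include a stop codon; neither assumption is confirmed by the text, and the function name `coding_summary` is illustrative only.

```python
def coding_summary(start, end):
    """Return (length in nucleotides, whole codons, leftover nucleotides).

    ASSUMPTION: start and end are 1-based inclusive positions on the
    transcript, so the length is end - start + 1.
    """
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


# Coding portion stated for transcript R35137_PEA_1_PEA_1_PEA_1_T10.
p9_length, p9_codons, p9_remainder = coding_summary(271, 1425)
```

Under these assumptions the coding portion is 1155 nucleotides, that is, 385 whole codons with no remainder, which agrees with the 385 amino acids described for the variant protein.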



Variant protein R35137_PEA_1_PEA_1_PEA_1_P8 according to the present invention
has an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
R35137_PEA_1_PEA_1_PEA_1_T11. An alignment is given to the known protein (Alanine
aminotransferase) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R35137_PEA_1_PEA_1_PEA_1_P8 and
ALAT_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P8,
comprising a first amino acid sequence being at least 90% homologous to
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVY
MEMGPPYAGQQELASFHSTSKGYMGEC corresponding to amino acids 1 - 320 of
ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 320 of
R35137_PEA_1_PEA_1_PEA_1_P8, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VRTRRVGARGPWPGPPRPMGHPLLRT corresponding to amino acids 321 - 346 of
R35137_PEA_1_PEA_1_PEA_1_P8, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P8,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence VRTRRVGARGPWPGPPRPMGHPLLRT in
R35137_PEA_1_PEA_1_PEA_1_P8.

It should be noted that the known protein sequence (ALAT_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for ALAT_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 1007 - Changes to ALAT_HUMAN_V1



SNP position(s) on amino acid sequence    Type of change
1                                         init_met
222                                       conflict



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein R35137_PEA_1_PEA_1_PEA_1_P8 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1008 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P8 sequence provides support for the deduced sequence of
this variant protein according to the present invention).

Table 1008 - Amino acid mutations



SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
14                                        H -> N                       Yes
54                                        Q ->                         No
233                                       R ->                         No
296                                       M ->                         No



Variant protein R35137_PEA_1_PEA_1_PEA_1_P8 is encoded by the following
transcript(s): R35137_PEA_1_PEA_1_PEA_1_T11, for which the sequence(s) is/are given at
the end of the application. The coding portion of transcript
R35137_PEA_1_PEA_1_PEA_1_T11 is shown in bold; this coding portion starts at position
271 and ends at position 1308. The transcript also has the following SNPs as listed in Table
1009 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein R35137_PEA_1_PEA_1_PEA_1_P8 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 1009 - Nucleic acid SNPs



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
230                                    C -> T                      No
231                                    C -> T                      No
310                                    C -> A                      Yes
432                                    G ->                        No
969                                    C ->                        No
1158                                   G ->                        No
1752                                   T -> G                      No
2030                                   C ->                        No
2091                                   G -> A                      No
2092                                   C -> A                      No
2174                                   A -> G                      No
2175                                   A -> G                      No
2232                                   C -> T                      Yes
2783                                   G -> C                      No
2862                                   C -> A                      Yes
3695                                   G -> A                      Yes
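
The nucleotide SNP positions in Table 1009 that fall inside the coding portion can be related to the amino acid positions in Table 1008. The following is a minimal sketch, assuming 1-based positions and a reading frame that begins at the stated coding start (position 271); the function name `amino_acid_position` is illustrative only and the mapping is offered as a consistency check, not as the method used to build the tables.

```python
CODING_START = 271  # stated start of the coding portion of the T11 transcript


def amino_acid_position(nucleotide_position, coding_start=CODING_START):
    """Map a 1-based transcript position to a 1-based amino acid position.

    ASSUMPTION: the reading frame starts at coding_start, so nucleotides
    coding_start .. coding_start + 2 form codon 1, and so on.
    """
    offset = nucleotide_position - coding_start
    if offset < 0:
        raise ValueError("position lies upstream of the coding portion")
    return offset // 3 + 1


# Coding-region SNP positions from Table 1009.
mapped = {pos: amino_acid_position(pos) for pos in (310, 432, 969, 1158)}
```

Under this reading the four positions map to amino acids 14, 54, 233 and 296, which are the positions listed in Table 1008; positions 230 and 231 lie upstream of the coding start and positions beyond 1308 lie downstream of the coding end, so they have no counterpart there.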



Variant protein R35137_PEA_1_PEA_1_PEA_1_P11 according to the present invention
has an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
R35137_PEA_1_PEA_1_PEA_1_T14. An alignment is given to the known protein (Alanine
aminotransferase) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R35137_PEA_1_PEA_1_PEA_1_P11 and
ALAT_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P11,
comprising a first amino acid sequence being at least 90% homologous to
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQAR
corresponding to amino acids 1 - 229 of ALAT_HUMAN_V1, which also corresponds to amino
acids 1 - 229 of R35137_PEA_1_PEA_1_PEA_1_P11, and a second amino acid sequence being
at least 90% homologous to SGFGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS
corresponding to amino acids 455 - 496 of ALAT_HUMAN_V1, which also corresponds to
amino acids 230 - 271 of R35137_PEA_1_PEA_1_PEA_1_P11, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
R35137_PEA_1_PEA_1_PEA_1_P11, comprising a polypeptide having a length "n", wherein n
is at least about 10 amino acids in length, optionally at least about 20 amino acids in length,
preferably at least about 30 amino acids in length, more preferably at least about 40 amino acids
in length and most preferably at least about 50 amino acids in length, wherein at least two amino
acids comprise RS, having a structure as follows: a sequence starting from any of amino acid
numbers 229-x to 229; and ending at any of amino acid numbers 230 + ((n-2) - x), in which x
varies from 0 to n-2.
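
The edge-portion definition above describes a family of windows of length n that all span the junction between amino acids 229 (R) and 230 (S). The following is a minimal sketch that enumerates those windows, assuming the reading that each value of x gives one window starting at 229 - x and ending at 230 + ((n-2) - x); the function name `edge_windows` is illustrative only.

```python
def edge_windows(n, left=229, right=230):
    """List (start, end) pairs for every window of length n across the junction.

    ASSUMPTION: for each x in 0 .. n-2 the window starts at left - x and
    ends at right + ((n - 2) - x), using 1-based inclusive positions.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span both junction residues")
    return [(left - x, right + (n - 2) - x) for x in range(n - 1)]


# Windows for the smallest length named in the text, n = 10.
windows_10 = edge_windows(10)
```

Each window has length n and contains both junction residues. A practical use would also discard windows that run past either end of the 271 amino acid variant protein, which this sketch does not do.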

It should be noted that the known protein sequence (ALAT_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for ALAT_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 1010 - Changes to ALAT_HUMAN_V1



\SNP posi]^^(sX<Hi5 
| ^mino ^bi<l sequence I 


Type of change ; : 


1 


init_met 


222 


conflict 



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein R35137_PEA_1_PEA_1_PEA_1_P11 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1011 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P11 sequence provides support for the deduced sequence of
this variant protein according to the present invention).

Table 1011 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
14                                        H -> N                       Yes
54                                        Q ->                         No



Variant protein R35137_PEA_1_PEA_1_PEA_1_P11 is encoded by the following
transcript(s): R35137_PEA_1_PEA_1_PEA_1_T14, for which the sequence(s) is/are given at
the end of the application. The coding portion of transcript
R35137_PEA_1_PEA_1_PEA_1_T14 is shown in bold; this coding portion starts at position
271 and ends at position 1083. The transcript also has the following SNPs as listed in Table
1012 (given according to their position on the nucleotide sequence, with the alternative nucleic
acid listed; the last column indicates whether the SNP is known or not; the presence of known
SNPs in variant protein R35137_PEA_1_PEA_1_PEA_1_P11 sequence provides support for
the deduced sequence of this variant protein according to the present invention).
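The nucleotide positions in Table 1012 can be related to the amino acid positions in Table 1011 through the stated coding portion (271 to 1083). The sketch below is illustrative only: it assumes 1-based, inclusive positions and that position 271 is the first base of the first codon, and the helper name `codon_number` is hypothetical.

```python
def codon_number(nt_pos, cds_start, cds_end):
    """Return the 1-based codon (amino acid) number for a transcript position.

    Returns None when the position lies outside the coding portion.
    Assumes cds_start is the first base of codon 1 (an assumption, not a
    statement from the application).
    """
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


CDS_START, CDS_END = 271, 1083  # coding portion of transcript T14
snp_positions = [230, 231, 310, 432, 1115, 1176, 1177]  # positions in Table 1012

mapping = {pos: codon_number(pos, CDS_START, CDS_END) for pos in snp_positions}
for pos, codon in mapping.items():
    print(pos, "non-coding" if codon is None else f"codon {codon}")
```

Under these assumptions, positions 310 and 432 fall in codons 14 and 54, the two amino acid positions listed in Table 1011, while the remaining positions fall outside the coding portion.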

Table 1012 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
230                                    C -> T                      No
231                                    C -> T                      No
310                                    C -> A                      Yes
432                                    G ->                        No
1115                                   C ->                        No
1176                                   G -> A                      No
1177                                   C -> A                      No



Variant protein R35137_PEA_1_PEA_1_PEA_1_P2 according to the present invention
has an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
R35137_PEA_1_PEA_1_PEA_1_T3. An alignment is given to the known protein (Alanine
aminotransferase) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R35137_PEA_1_PEA_1_PEA_1_P2 and ALAT_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P2,
comprising a first amino acid sequence being at least 90% homologous to
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEV corresponding to amino acids 1 -
274 of ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 274 of
R35137_PEA_1_PEA_1_PEA_1_P2, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP
LHLQGLHGRVRVPRRLCGGGEHGRCSAAADAEADECAAVPAGARTGPAGPGGQPAR
AHRPLLCAVPG corresponding to amino acids 275 - 399 of
R35137_PEA_1_PEA_1_PEA_1_P2, wherein said first amino acid sequence and second amino
acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P2,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence
RGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAVPLIQEGAHGDGAALRRAAGACLLP
LHLQGLHGRVRVPRRLCGGGEHGRCSAAADAEADECAAVPAGARTGPAGPGGQPAR
AHRPLLCAVPG in R35137_PEA_1_PEA_1_PEA_1_P2.
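The tiered homology language above (at least 70%, 80%, 85%, 90%, 95%) can be illustrated with a simple calculation. The sketch below is a deliberate simplification and not the application's method: it assumes "homologous" is approximated by ungapped percent identity between two equal-length sequences, whereas the application's own definition of homology governs. The function names and the candidate sequence (the first 20 tail residues with two hypothetical substitutions) are invented for illustration.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def strictest_tier(identity, tiers=(95, 90, 85, 80, 70)):
    """Return the highest listed threshold that the identity value meets."""
    for tier in tiers:
        if identity >= tier:
            return tier
    return None


reference = "RGAGEREAGQQSAPVTPCAL"  # first 20 residues of the tail sequence above
candidate = "RGAGEKEAGQQSAPVTPCAV"  # hypothetical: R->K at residue 6, L->V at residue 20

identity = percent_identity(reference, candidate)
print(identity, strictest_tier(identity))
```

A real comparison would normally use an alignment tool that handles gaps and substitution scoring; the point here is only how a single identity value maps onto the nested thresholds.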

It should be noted that the known protein sequence (ALAT_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for ALAT_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 1013 - Changes to ALAT_HUMAN_V1

SNP position(s) on amino acid sequence    Type of change
1                                         init_met
222                                       conflict



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein R35137_PEA_1_PEA_1_PEA_1_P2 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1014 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P2 sequence provides support for the deduced sequence of
this variant protein according to the present invention).

Table 1014 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
14                                        H -> N                       Yes
54                                        Q ->                         No
233                                       R ->                         No
319                                       G ->                         No



Variant protein R35137_PEA_1_PEA_1_PEA_1_P2 is encoded by the following
transcript(s): R35137_PEA_1_PEA_1_PEA_1_T3, for which the sequence(s) is/are given at the
end of the application. The coding portion of transcript R35137_PEA_1_PEA_1_PEA_1_T3 is
shown in bold; this coding portion starts at position 271 and ends at position 1467. The
transcript also has the following SNPs as listed in Table 1015 (given according to their position
on the nucleotide sequence, with the alternative nucleic acid listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P2 sequence provides support for the deduced sequence of
this variant protein according to the present invention).
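The coordinates given for this variant can be cross-checked against each other: the comparison report places the two parts of the protein at amino acids 1 - 274 and 275 - 399, and the coding portion of the transcript runs from position 271 to 1467. The sketch below is a consistency check under stated assumptions, not part of the application: positions are taken as 1-based and inclusive, the stated coding portion is assumed to exclude the stop codon, and the helper name is hypothetical.

```python
def codons_in_coding_portion(cds_start, cds_end):
    """Number of whole codons in an inclusive, 1-based coding range."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding portion length is not a multiple of three")
    return length // 3


first_range = (1, 274)     # part shared with ALAT_HUMAN_V1
second_range = (275, 399)  # unique tail of the variant

# The two parts are "contiguous and in a sequential order" if the second
# begins immediately after the first ends.
contiguous = second_range[0] == first_range[1] + 1
protein_length = second_range[1]

codons = codons_in_coding_portion(271, 1467)
print(contiguous, codons, codons == protein_length)
```

Here the coding portion spans 1197 nucleotides, or 399 codons, which agrees with the 399 residues described for the variant.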

Table 1015 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
230                                    C -> T                      No
231                                    C -> T                      No
310                                    C -> A                      Yes
432                                    G ->                        No
969                                    C ->                        No
1225                                   G ->                        No
1645                                   T -> G                      No
1857                                   C ->                        No
1918                                   G -> A                      No
1919                                   C -> A                      No
2001                                   A -> G                      No
2002                                   A -> G                      No
2059                                   C -> T                      Yes
2610                                   G -> C                      No
2689                                   C -> A                      Yes
3522                                   G -> A                      Yes



Variant protein R35137_PEA_1_PEA_1_PEA_1_P4 according to the present invention
has an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
R35137_PEA_1_PEA_1_PEA_1_T5. An alignment is given to the known protein (Alanine
aminotransferase) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R35137_PEA_1_PEA_1_PEA_1_P4 and ALAT_HUMAN_V1:

1. An isolated chimeric polypeptide encoding for R35137_PEA_1_PEA_1_PEA_1_P4,
comprising a first amino acid sequence being at least 90% homologous to
MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVK
KPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACG
GHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVFLSTGASDAIVTVLKLLVAGEG
HTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDHCRP
RALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNWAAGSQFHSFKKVL
MEMGPPYAGQQELASFHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRL
CPPWGQALLDLWSPPAPTDPSFAQFQAEKQAVLAELAAKAKLTEQVFNEAPGISCNP
VQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGICVVPGSGFGQREGTYH
FRMTILPPLEKLRLLLEKLSRFHAKFTLE corresponding to amino acids 1 - 494 of
ALAT_HUMAN_V1, which also corresponds to amino acids 1 - 494 of
R35137_PEA_1_PEA_1_PEA_1_P4, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
SPGRLWSPLYLLLMPGGVGWGGCWAPASLQVPNKAVWQSDSKXEALAAAWPAPTCL
PFLQA corresponding to amino acids 495 - 555 of R35137_PEA_1_PEA_1_PEA_1_P4,
wherein said first amino acid sequence and second amino acid sequence are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of R35137_PEA_1_PEA_1_PEA_1_P4,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence
SPGRLWSPLYLLLMPGGVGWGGCWAPASLQVPNKAVWQSDSKXEALAAAWPAPTCL
PFLQA in R35137_PEA_1_PEA_1_PEA_1_P4.

It should be noted that the known protein sequence (ALAT_HUMAN) has one or more
changes relative to the sequence given at the end of the application and named as being the amino
acid sequence for ALAT_HUMAN_V1. These changes were previously known to occur and are
listed in the table below.

Table 1016 - Changes to ALAT_HUMAN_V1

SNP position(s) on amino acid sequence    Type of change
1                                         init_met
222                                       conflict



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
intracellularly. The protein localization is believed to be intracellular because neither of the
trans-membrane region prediction programs predicted a trans-membrane region for this protein.
In addition, both signal-peptide prediction programs predict that this protein is a non-secreted
protein.

Variant protein R35137_PEA_1_PEA_1_PEA_1_P4 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1017 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P4 sequence provides support for the deduced sequence of
this variant protein according to the present invention).



Table 1017 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
14                                        H -> N                       Yes
54                                        Q ->                         No
233                                       R ->                         No
296                                       M ->                         No
436                                       D -> E                       No
508                                       M -> I                       No
509                                       P -> T                       No
536                                       K -> R                       No

Variant protein R35137_PEA_1_PEA_1_PEA_1_P4 is encoded by the following
transcript(s): R35137_PEA_1_PEA_1_PEA_1_T5, for which the sequence(s) is/are given at the
end of the application. The coding portion of transcript R35137_PEA_1_PEA_1_PEA_1_T5 is
shown in bold; this coding portion starts at position 271 and ends at position 1935. The
transcript also has the following SNPs as listed in Table 1018 (given according to their position
on the nucleotide sequence, with the alternative nucleic acid listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
R35137_PEA_1_PEA_1_PEA_1_P4 sequence provides support for the deduced sequence of
this variant protein according to the present invention).

Table 1018 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
230                                    C -> T                      No
231                                    C -> T                      No
310                                    C -> A                      Yes
432                                    G ->                        No
969                                    C ->                        No
1158                                   G ->                        No
1578                                   T -> G                      No
1794                                   G -> A                      No
1795                                   C -> A                      No
1877                                   A -> G                      No
1878                                   A -> G                      No
1935                                   C -> T                      Yes
2486                                   G -> C                      No
2565                                   C -> A                      Yes
3398                                   G -> A                      Yes


As noted above, cluster R35137 features 20 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_2 according to the present
invention is supported by 19 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1019
below describes the starting and ending position of this segment on each transcript.

Table 1019 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1                            266
R35137_PEA_1_PEA_1_PEA_1_T5      1                            266
R35137_PEA_1_PEA_1_PEA_1_T10     1                            266
R35137_PEA_1_PEA_1_PEA_1_T11     1                            266
R35137_PEA_1_PEA_1_PEA_1_T12     1                            266
R35137_PEA_1_PEA_1_PEA_1_T14     1                            266



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_3 according to the present
invention is supported by 24 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1020
below describes the starting and ending position of this segment on each transcript.

Table 1020 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      267                          432
R35137_PEA_1_PEA_1_PEA_1_T5      267                          432
R35137_PEA_1_PEA_1_PEA_1_T10     267                          432
R35137_PEA_1_PEA_1_PEA_1_T11     267                          432
R35137_PEA_1_PEA_1_PEA_1_T12     267                          432
R35137_PEA_1_PEA_1_PEA_1_T14     267                          432

Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_9 according to the present
invention is supported by 25 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1021
below describes the starting and ending position of this segment on each transcript.

Table 1021 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      632                          765
R35137_PEA_1_PEA_1_PEA_1_T5      632                          765
R35137_PEA_1_PEA_1_PEA_1_T10     632                          765
R35137_PEA_1_PEA_1_PEA_1_T11     632                          765
R35137_PEA_1_PEA_1_PEA_1_T12     632                          765
R35137_PEA_1_PEA_1_PEA_1_T14     632                          765



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_11 according to the present
invention is supported by 30 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1022
below describes the starting and ending position of this segment on each transcript.

Table 1022 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      766                          955
R35137_PEA_1_PEA_1_PEA_1_T5      766                          955
R35137_PEA_1_PEA_1_PEA_1_T10     766                          955
R35137_PEA_1_PEA_1_PEA_1_T11     766                          955
R35137_PEA_1_PEA_1_PEA_1_T12     766                          955
R35137_PEA_1_PEA_1_PEA_1_T14     766                          955



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_16 according to the present
invention is supported by 23 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1023 below describes the starting and ending
position of this segment on each transcript.

Table 1023 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1157                         1293
R35137_PEA_1_PEA_1_PEA_1_T5      1090                         1226
R35137_PEA_1_PEA_1_PEA_1_T10     1157                         1293
R35137_PEA_1_PEA_1_PEA_1_T11     1090                         1226
R35137_PEA_1_PEA_1_PEA_1_T12     1157                         1293
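Table 1023 is the first table in which the same segment starts at different positions on different transcripts (1157 versus 1090). One way to read the difference, sketched below, is that the preceding short segment node_15 (positions 1090 to 1156, listed in Table 1032 for transcripts T3, T10 and T12 only) shifts everything downstream by its own length. This is an illustrative check using values copied from Tables 1023 and 1032; the variable names are hypothetical and the interpretation is an assumption.

```python
# Start of node_16 on each transcript (Table 1023).
node_16_start = {"T3": 1157, "T5": 1090, "T10": 1157, "T11": 1090, "T12": 1157}

# node_15 occupies 1090-1156 and is present only on these transcripts (Table 1032).
node_15_range = (1090, 1156)
node_15_transcripts = {"T3", "T10", "T12"}
node_15_length = node_15_range[1] - node_15_range[0] + 1

# On every transcript the segment before node_15 ends at 1089 (Table 1031),
# so node_16 should start at 1090, plus the length of node_15 where present.
expected_start = {
    name: 1090 + (node_15_length if name in node_15_transcripts else 0)
    for name in node_16_start
}

for name in node_16_start:
    print(name, node_16_start[name], expected_start[name])
```

Under this reading the 67-nucleotide offset between the two groups of transcripts is fully accounted for by the presence or absence of node_15.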



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_18 according to the present
invention is supported by 24 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and



WO 2006/131783 



PCT/IB2005/004037 



1047 

R35137_PEA_1_PEA_1_PEA_1_T12. Table 1024 below describes the starting and ending
position of this segment on each transcript.

Table 1024 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1294                         1468
R35137_PEA_1_PEA_1_PEA_1_T5      1227                         1401
R35137_PEA_1_PEA_1_PEA_1_T10     1394                         1568
R35137_PEA_1_PEA_1_PEA_1_T11     1327                         1501
R35137_PEA_1_PEA_1_PEA_1_T12     1394                         1568



Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment (in relation to lung cancer), shown in Table 1025.

Table 1025 - Oligonucleotides related to this segment

Oligonucleotide name    Overexpressed in cancers    Chip reference
R35137J)_S_0            lung malignant tumors       LUN

Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_20 according to the present
invention is supported by 29 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1026 below describes the starting and ending
position of this segment on each transcript.

Table 1026 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1469                         1624
R35137_PEA_1_PEA_1_PEA_1_T5      1402                         1557
R35137_PEA_1_PEA_1_PEA_1_T10     1569                         1724
R35137_PEA_1_PEA_1_PEA_1_T11     1502                         1657
R35137_PEA_1_PEA_1_PEA_1_T12     1569                         1724



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_27 according to the present
invention is supported by 39 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1027
below describes the starting and ending position of this segment on each transcript.

Table 1027 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1876                         3898
R35137_PEA_1_PEA_1_PEA_1_T5      1752                         3774
R35137_PEA_1_PEA_1_PEA_1_T10     1976                         3998
R35137_PEA_1_PEA_1_PEA_1_T11     2049                         4071
R35137_PEA_1_PEA_1_PEA_1_T12     2116                         4138
R35137_PEA_1_PEA_1_PEA_1_T14     1134                         1250



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_5 according to the present
invention is supported by 20 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1028
below describes the starting and ending position of this segment on each transcript.

Table 1028 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      433                          522
R35137_PEA_1_PEA_1_PEA_1_T5      433                          522
R35137_PEA_1_PEA_1_PEA_1_T10     433                          522
R35137_PEA_1_PEA_1_PEA_1_T11     433                          522
R35137_PEA_1_PEA_1_PEA_1_T12     433                          522
R35137_PEA_1_PEA_1_PEA_1_T14     433                          522



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_7 according to the present
invention is supported by 23 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1029
below describes the starting and ending position of this segment on each transcript.

Table 1029 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      523                          631
R35137_PEA_1_PEA_1_PEA_1_T5      523                          631
R35137_PEA_1_PEA_1_PEA_1_T10     523                          631
R35137_PEA_1_PEA_1_PEA_1_T11     523                          631
R35137_PEA_1_PEA_1_PEA_1_T12     523                          631
R35137_PEA_1_PEA_1_PEA_1_T14     523                          631



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_12 according to the present
invention is supported by 22 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1030 below describes the starting and ending
position of this segment on each transcript.

Table 1030 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      956                          1009
R35137_PEA_1_PEA_1_PEA_1_T5      956                          1009
R35137_PEA_1_PEA_1_PEA_1_T10     956                          1009
R35137_PEA_1_PEA_1_PEA_1_T11     956                          1009
R35137_PEA_1_PEA_1_PEA_1_T12     956                          1009



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_14 according to the present
invention is supported by 23 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1031 below describes the starting and ending
position of this segment on each transcript.

Table 1031 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1010                         1089
R35137_PEA_1_PEA_1_PEA_1_T5      1010                         1089
R35137_PEA_1_PEA_1_PEA_1_T10     1010                         1089
R35137_PEA_1_PEA_1_PEA_1_T11     1010                         1089
R35137_PEA_1_PEA_1_PEA_1_T12     1010                         1089



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_15 according to the present
invention is supported by 6 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T10 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1032 below describes the starting and ending
position of this segment on each transcript.

Table 1032 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1090                         1156
R35137_PEA_1_PEA_1_PEA_1_T10     1090                         1156
R35137_PEA_1_PEA_1_PEA_1_T12     1090                         1156



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_17 according to the present
invention is supported by 5 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1033 below describes the starting and ending
position of this segment on each transcript.

Table 1033 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T10     1294                         1393
R35137_PEA_1_PEA_1_PEA_1_T11     1227                         1326
R35137_PEA_1_PEA_1_PEA_1_T12     1294                         1393

Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_21 according to the present
invention is supported by 6 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T11 and R35137_PEA_1_PEA_1_PEA_1_T12. Table 1034
below describes the starting and ending position of this segment on each transcript.

Table 1034 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T11     1658                         1731
R35137_PEA_1_PEA_1_PEA_1_T12     1725                         1798



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_22 according to the present
invention is supported by 31 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11 and
R35137_PEA_1_PEA_1_PEA_1_T12. Table 1035 below describes the starting and ending
position of this segment on each transcript.

Table 1035 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1625                         1697
R35137_PEA_1_PEA_1_PEA_1_T5      1558                         1630
R35137_PEA_1_PEA_1_PEA_1_T10     1725                         1797
R35137_PEA_1_PEA_1_PEA_1_T11     1732                         1804
R35137_PEA_1_PEA_1_PEA_1_T12     1799                         1871



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_23 according to the present
invention is supported by 29 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T5,
R35137_PEA_1_PEA_1_PEA_1_T10, R35137_PEA_1_PEA_1_PEA_1_T11,
R35137_PEA_1_PEA_1_PEA_1_T12 and R35137_PEA_1_PEA_1_PEA_1_T14. Table 1036
below describes the starting and ending position of this segment on each transcript.

Table 1036 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3      1698                         1737
R35137_PEA_1_PEA_1_PEA_1_T5      1631                         1670
R35137_PEA_1_PEA_1_PEA_1_T10     1798                         1837
R35137_PEA_1_PEA_1_PEA_1_T11     1805                         1844
R35137_PEA_1_PEA_1_PEA_1_T12     1872                         1911
R35137_PEA_1_PEA_1_PEA_1_T14     956                          995



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_24 according to the present
invention is supported by 5 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T11 and R35137_PEA_1_PEA_1_PEA_1_T12. Table 1037
below describes the starting and ending position of this segment on each transcript.

Table 1037 - Segment location on transcripts

Transcript name                  Segment starting position    Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T11     1845                         1910
R35137_PEA_1_PEA_1_PEA_1_T12     1912                         1977



Segment cluster R35137JPEA_l_PEA_l_PEA_l_node_25 according to the present 
invention is supported by 30 libraries. The number of libraries was determined as previously 
described. This segment can be found in the following transcript(s): 
10 R35 1 37_PEA_1_PEA_1_PEA_1_T3 , R35 1 37 JPEA_1_PEA_1 JPEA„1_T5, 
R35137_PEA_1 J>EA_1 JPEA_1_T10> R35137_PEA_1_PEA_1_PEA_1_T1 1, 
R35137JPEA_1 JPEA_1 JPEA_1_T12 and R35137_PEA_1_PEA_1JPEA_1_T14. Table 1038 
below describes the starting and ending position of this segment on each transcript. 

Table 1038 - Segment location on transcripts 



Transcript name                 Segment starting position   Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3     1738                        1818
R35137_PEA_1_PEA_1_PEA_1_T5     1671                        1751
R35137_PEA_1_PEA_1_PEA_1_T10    1838                        1918
R35137_PEA_1_PEA_1_PEA_1_T11    1911                        1991
R35137_PEA_1_PEA_1_PEA_1_T12    1978                        2058
R35137_PEA_1_PEA_1_PEA_1_T14    996                         1076



Segment cluster R35137_PEA_1_PEA_1_PEA_1_node_26 according to the present
invention is supported by 29 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
R35137_PEA_1_PEA_1_PEA_1_T3, R35137_PEA_1_PEA_1_PEA_1_T10,
R35137_PEA_1_PEA_1_PEA_1_T11, R35137_PEA_1_PEA_1_PEA_1_T12 and
R35137_PEA_1_PEA_1_PEA_1_T14. Table 1039 below describes the starting and ending
position of this segment on each transcript.



Table 1039 - Segment location on transcripts 



Transcript name                 Segment starting position   Segment ending position
R35137_PEA_1_PEA_1_PEA_1_T3     1819                        1875
R35137_PEA_1_PEA_1_PEA_1_T10    1919                        1975
R35137_PEA_1_PEA_1_PEA_1_T11    1992                        2048
R35137_PEA_1_PEA_1_PEA_1_T12    2059                        2115
R35137_PEA_1_PEA_1_PEA_1_T14    1077                        1133



Variant protein alignment to the previously known protein:

Sequence name: ALAT_HUMAN_V1

Sequence documentation:

Alignment of: R35137_PEA_1_PEA_1_PEA_1_P9 x ALAT_HUMAN_V1

Alignment segment 1/1:

Quality: 2619.00
Escore: 0
Matching length: 274                  Total length: 274
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50

 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100

101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150

151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200

201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250

251 TRECIEAVIRFAFEERLFLLADEV 274
    ||||||||||||||||||||||||
251 TRECIEAVIRFAFEERLFLLADEV 274

Sequence name: ALAT_HUMAN_V1

Sequence documentation:

Alignment of: R35137_PEA_1_PEA_1_PEA_1_P8 x ALAT_HUMAN_V1

Alignment segment 1/1:

Quality: 3088.00
Escore: 0
Matching length: 320                  Total length: 320
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50

 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100

101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150

151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200

201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250

251 TRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPY 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 TRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPY 300

301 AGQQELASFHSTSKGYMGEC 320
    ||||||||||||||||||||
301 AGQQELASFHSTSKGYMGEC 320
Sequence name: ALAT_HUMAN_V1

Sequence documentation:

Alignment of: R35137_PEA_1_PEA_1_PEA_1_P11 x ALAT_HUMAN_V1

Alignment segment 1/1:

Quality: 2487.00
Escore: 0
Matching length: 271                  Total length: 496
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 54.64       Total Percent Identity: 54.64
Gaps: 1

Alignment:

  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50

 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100

101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150

151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200

201 AVQVDYYLDEERAWALDVAELHRALGQAR..................... 229
    |||||||||||||||||||||||||||||
201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250

229 .................................................. 229

251 TRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPY 300

229 .................................................. 229

301 AGQQELASFHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCP 350

229 .................................................. 229

351 PVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAELAAKAKLTEQVFNEA 400

229 .................................................. 229

401 PGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGIC 450

230 ....SGFGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS 271
        ||||||||||||||||||||||||||||||||||||||||||
451 VVPGSGFGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS 496
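The summary figures reported for the R35137_PEA_1_PEA_1_PEA_1_P11 alignment appear to be related in a simple way: the "Total Percent" values are consistent with the matching length divided by the total length. The sketch below reproduces that arithmetic. It is an interpretation of the reported numbers, not a description of the alignment program itself, and the function name is hypothetical.

```python
def total_percent(matching_length: int, total_length: int,
                  matching_percent: float = 100.0) -> float:
    """Scale a percentage over the matched region to the full alignment length.

    Assumption: the "Total Percent" figure equals the "Matching Percent"
    figure multiplied by matching_length / total_length, rounded to two
    decimal places as in the alignment reports.
    """
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    return round(matching_percent * matching_length / total_length, 2)


# Values reported for R35137_PEA_1_PEA_1_PEA_1_P11 x ALAT_HUMAN_V1.
p11_total_identity = total_percent(271, 496)

# Values reported for the variants aligned without gaps (274 of 274).
ungapped_total_identity = total_percent(274, 274)
```

With 271 matching positions over a total length of 496, this gives 54.64, in agreement with the reported Total Percent Identity; the alignments without gaps give 100.00.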
Sequence name: ALAT_HUMAN_V1

Sequence documentation:

Alignment of: R35137_PEA_1_PEA_1_PEA_1_P2 x ALAT_HUMAN_V1

Alignment segment 1/1:

Quality: 2619.00
Escore: 0
Matching length: 274                  Total length: 274
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50

 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100

101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150

151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200

201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250

251 TRECIEAVIRFAFEERLFLLADEV 274
    ||||||||||||||||||||||||
251 TRECIEAVIRFAFEERLFLLADEV 274



Sequence name: ALAT_HUMAN_V1

Sequence documentation:

Alignment of: R35137_PEA_1_PEA_1_PEA_1_P4 x ALAT_HUMAN_V1

Alignment segment 1/1:

Quality: 4785.00
Escore: 0
Matching length: 494                  Total length: 494
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQ 50

 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 ELRQGVKKPFTEVIRANIGDAQAMGQRPITFLRQVLALCVNPDLLSSPNF 100

101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 PDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPAD 150

151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 PNNVFLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELG 200

201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 AVQVDYYLDEERAWALDVAELHRALGQARDHCRPRALCVINPGNPTGQVQ 250

251 TRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPY 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 TRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPY 300

301 AGQQELASFHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCP 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 AGQQELASFHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCP 350

351 PVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAELAAKAKLTEQVFNEA 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 PVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAELAAKAKLTEQVFNEA 400

401 PGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGIC 450
    ||||||||||||||||||||||||||||||||||||||||||||||||||
401 PGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGIC 450

451 VVPGSGFGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLE 494
    ||||||||||||||||||||||||||||||||||||||||||||
451 VVPGSGFGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLE 494

DESCRIPTION FOR CLUSTER Z25299 
Cluster Z25299 features 5 transcript(s) and 11 segment(s) of interest, the names for which
are given in Tables 1040 and 1041, respectively; the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 1042.

Table 1040 - Transcripts of interest 



Transcript Name     Sequence ID No.
Z25299_PEA_2_T1     120
Z25299_PEA_2_T2     121
Z25299_PEA_2_T3     122
Z25299_PEA_2_T6     123
Z25299_PEA_2_T9     124


Table 1041 - Segments of interest 




Segment Name             Sequence ID No.
Z25299_PEA_2_node_20     876
Z25299_PEA_2_node_21     877
Z25299_PEA_2_node_23     878
Z25299_PEA_2_node_24     879
Z25299_PEA_2_node_8      880
Z25299_PEA_2_node_12     881
Z25299_PEA_2_node_13     882
Z25299_PEA_2_node_14     883
Z25299_PEA_2_node_17     884
Z25299_PEA_2_node_18     885
Z25299_PEA_2_node_19     886


Table 1042 - Proteins of interest 


Protein Name         Sequence ID No.
Z25299_PEA_2_P2      1390
Z25299_PEA_2_P3      1391
Z25299_PEA_2_P7      1392
Z25299_PEA_2_P10     1393



These sequences are variants of the known protein Antileukoproteinase 1 precursor
(SwissProt accession identifier ALK1_HUMAN; known also according to the synonyms ALP;
HUSI-1; Seminal proteinase inhibitor; Secretory leukocyte protease inhibitor; BLPI; Mucus
proteinase inhibitor; MPI; WAP four-disulfide core domain protein 4; Protease inhibitor
WAP4), SEQ ID NO: 1454, referred to herein as the previously known protein.

Protein Antileukoproteinase 1 precursor is known or believed to have the following
function(s): Acid-stable proteinase inhibitor with strong affinities for trypsin, chymotrypsin,
elastase, and cathepsin G. May prevent elastase-mediated damage to oral and possibly other
mucosal tissues. The sequence for protein Antileukoproteinase 1 precursor is given at the end of
the application, as "Antileukoproteinase 1 precursor amino acid sequence". Protein
Antileukoproteinase 1 precursor localization is believed to be Secreted.

It has been investigated for clinical/therapeutic use in humans, for example as a target for
an antibody or small molecule, and/or as a direct therapeutic; available information related to
these investigations is as follows. Potential pharmaceutically related or therapeutically related
activity or activities of the previously known protein are as follows: Elastase inhibitor; Tryptase
inhibitor. A therapeutic role for a protein represented by the cluster has been predicted. The
cluster was assigned this field because there was information in the drug database or the public
databases (e.g., described herein above) that this protein, or part thereof, is used or can be used
for a potential therapeutic indication: Antiinflammatory; Antiasthma.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: proteinase inhibitor; serine protease inhibitor, which are annotation(s)
related to Molecular Function.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster Z25299 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 35 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
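The "parts per million" figure described above is a simple ratio scaled by one million. The sketch below illustrates the arithmetic only. The EST counts in the example are hypothetical and are not taken from the application, and the weighting applied to the EST counts is not described in this section, so it is not modeled here.

```python
def parts_per_million(cluster_est_count: float, category_est_count: float) -> float:
    """Expression of a cluster's ESTs relative to all ESTs in a category, in ppm.

    Assumption: both arguments are already weighted as the application
    describes elsewhere; this function only performs the final scaling.
    """
    if category_est_count <= 0:
        raise ValueError("category_est_count must be positive")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical example: 93 ESTs from the cluster among 200,000 ESTs
# sequenced from one tissue category.
example_ppm = parts_per_million(93, 200_000)
```

In this hypothetical case the result is 465 parts per million, the same order of magnitude as the larger entries in the normal tissue distribution table.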

Overall, the following results were obtained as shown with regard to the histograms in
Figure 35 and Table 1043. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: brain malignant tumors, a mixture of malignant tumors from
different tissues and ovarian carcinoma.



Table 1043 - Normal tissue distribution 



Name of Tissue     Number
bladder            82
bone               6
brain              0
colon              37
epithelial         145
general            73
head and neck      638
kidney             26
liver              68
lung               465
breast             52
ovary              0
pancreas           20
prostate           36
skin               215
stomach            219
uterus             113

Table 1044 - P values and ratios for expression in cancerous tissue 



Name of Tissue     P1         P2         SP1        R3     SP2        R4
bladder            8.2e-01    8.5e-01    9.2e-01    0.6    9.7e-01    0.5
bone               5.5e-01    7.3e-01    4.0e-01    2.1    4.9e-01    1.5
brain              8.8e-02    1.5e-01    2.3e-03    7.7    1.2e-02    4.8
colon              3.3e-01    2.8e-01    4.2e-01    1.6    4.2e-01    1.5
epithelial         2.5e-01    7.6e-01    3.8e-01    1.0    1          0.6
general            6.4e-03    2.5e-01    1.7e-06    1.6    5.2e-01    0.9
head and neck      3.6e-01    5.9e-01    7.6e-01    0.6    1          0.3
kidney             7.4e-01    8.4e-01    2.1e-01    2.1    4.2e-01    1.4
liver              4.1e-01    9.1e-01    4.2e-02    3.2    6.4e-01    0.8
lung               7.6e-01    8.3e-01    9.8e-01    0.5    1          0.3
breast             5.0e-01    5.5e-01    9.8e-02    1.6    3.4e-01    1.1
ovary              3.7e-02    3.0e-02    6.9e-03    6.1    4.9e-03    5.6
pancreas           3.8e-01    3.6e-01    3.6e-01    1.7    3.9e-01    1.5
prostate           9.1e-01    9.2e-01    8.9e-01    0.5    9.4e-01    0.5
skin               6.0e-01    8.1e-01    9.3e-01    0.4    1          0.1
stomach            3.0e-01    8.1e-01    9.1e-01    0.6    1          0.3
uterus             1.6e-01    1.3e-01    3.2e-02    1.6    3.0e-01    1.1



As noted above, cluster Z25299 features 5 transcript(s), which were listed in Table 1040
above. These transcript(s) encode for protein(s) which are variant(s) of protein
Antileukoproteinase 1 precursor. A description of each variant protein according to the present
invention is now provided.

Variant protein Z25299_PEA_2_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z25299_PEA_2_T1. An alignment is given to the known protein (Antileukoproteinase 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z25299_PEA_2_P2 and ALK1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z25299_PEA_2_P2, comprising a first
amino acid sequence being at least 90% homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP 
GKKRCCPDTCGIKCLDPVDTPN^ 

CCMGMCGKSCVSPVK corresponding to amino acids 1-131 of ALK1_HUMAN, which also
corresponds to amino acids 1-131 of Z25299_PEA_2_P2, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
GKQGMRAH corresponding to amino acids 132-139 of Z25299_PEA_2_P2, wherein said
first and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z25299_PEA_2_P2, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GKQGMRAH in Z25299_PEA_2_P2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
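The localization reasoning stated above amounts to a simple decision rule: the protein is called secreted when every signal-peptide predictor reports a signal peptide and no trans-membrane predictor reports a trans-membrane region. The sketch below expresses that rule. The function name and the boolean inputs are illustrative assumptions; the actual programs (other than SignalP, which the text names) and their output formats are not specified in this section.

```python
from typing import Sequence


def is_predicted_secreted(signal_peptide_calls: Sequence[bool],
                          transmembrane_calls: Sequence[bool]) -> bool:
    """Apply the rule described in the text to per-program predictions.

    signal_peptide_calls: one boolean per signal-peptide prediction program,
        True if that program predicts a signal peptide.
    transmembrane_calls: one boolean per trans-membrane prediction program,
        True if that program predicts a trans-membrane region.
    """
    if not signal_peptide_calls or not transmembrane_calls:
        # Without predictions from both kinds of program the rule cannot be applied.
        return False
    return all(signal_peptide_calls) and not any(transmembrane_calls)


# The situation described for this variant: both signal-peptide programs
# agree, and neither trans-membrane program predicts a region.
variant_is_secreted = is_predicted_secreted([True, True], [False, False])
```

A disagreement between the two signal-peptide programs, or a single trans-membrane prediction, would make this rule return False; how the application treats such mixed cases is not stated here.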

Variant protein Z25299_PEA_2_P2 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1045 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z25299_PEA_2_P2
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1045 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
136                                      M->T                        Yes
20                                       P->                         No
43                                       C->R                        No
48                                       K->N                        No
83                                       R->K                        No
84                                       R->W                        No



Variant protein Z25299_PEA_2_P2 is encoded by the following transcript(s):
Z25299_PEA_2_T1, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z25299_PEA_2_T1 is shown in bold; this coding portion starts at
position 124 and ends at position 540. The transcript also has the following SNPs as listed in
Table 1046 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z25299_PEA_2_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1046 - Nucleic acid SNPs 



SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
122                                   C->T                       No
123                                   C->T                       No
530                                   T->C                       Yes
989                                   C->T                       Yes
1127                                  C->T                       Yes
1162                                  A->C                       Yes
1180                                  A->C                       Yes
1183                                  A->C                       Yes
1216                                  A->C                       Yes
1262                                  G->A                       Yes
183                                   T->                        No
250                                   T->C                       No
267                                   A->C                       No
267                                   A->G                       No
339                                   C->T                       Yes
371                                   G->A                       No
373                                   A->T                       No
435                                   C->T                       No
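The nucleotide positions in Table 1046 can be related to the amino acid positions in Table 1045 through the stated coding portion of transcript Z25299_PEA_2_T1 (positions 124 to 540). The following sketch assumes 1-based inclusive coordinates and that the first coding nucleotide is the first base of codon 1; both are assumptions, although they are consistent with the 417-nucleotide span corresponding to 139 codons. The function name is illustrative.

```python
from typing import Optional, Tuple

# Coding portion of transcript Z25299_PEA_2_T1 as stated in the text.
CDS_START = 124
CDS_END = 540


def codon_index(nt_pos: int,
                cds_start: int = CDS_START,
                cds_end: int = CDS_END) -> Optional[Tuple[int, int]]:
    """Map a transcript position to (amino acid position, position within codon).

    Both returned values are 1-based. Returns None when the nucleotide lies
    outside the coding portion (for example in an untranslated region).
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


# Number of codons covered by the coding portion.
codon_count = (CDS_END - CDS_START + 1) // 3

# Amino acid positions hit by the nucleotide SNPs inside the coding portion.
snp_positions = [122, 123, 530, 989, 183, 250, 267, 339, 371, 373, 435]
affected_residues = {
    pos: codon_index(pos)[0]
    for pos in snp_positions
    if codon_index(pos) is not None
}
```

Under these assumptions, nucleotide 530 falls in codon 136 and nucleotides 183, 250, 267, 371 and 373 fall in codons 20, 43, 48, 83 and 84, which matches the amino acid positions listed for this variant; positions 339 and 435 map to codons 72 and 104, which do not appear among the non-silent changes.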



Variant protein Z25299_PEA_2_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z25299_PEA_2_T2. An alignment is given to the known protein (Antileukoproteinase 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z25299_PEA_2_P3 and ALK1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z25299_PEA_2_P3, comprising a first
amino acid sequence being at least 90% homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNPTR^ 

CCMGMCGKSCVSPVK corresponding to amino acids 1-131 of ALK1_HUMAN, which also
corresponds to amino acids 1-131 of Z25299_PEA_2_P3, and a second amino acid sequence
being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at least
90% and most preferably at least 95% homologous to a polypeptide having the sequence
GEKRHHKQLRDQEVDPLEMRRHSAG corresponding to amino acids 132-156 of
Z25299_PEA_2_P3, wherein said first and second amino acid sequences are contiguous and in a
sequential order.

2. An isolated polypeptide encoding for a tail of Z25299_PEA_2_P3, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GEKRHHKQLRDQEVDPLEMRRHSAG in Z25299_PEA_2_P3.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z25299_PEA_2_P3 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1047 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z25299_PEA_2_P3
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1047 - Amino acid mutations 



SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
20                                       P->                         No
43                                       C->R                        No
48                                       K->N                        No
83                                       R->K                        No
84                                       R->W                        No



Variant protein Z25299_PEA_2_P3 is encoded by the following transcript(s):
Z25299_PEA_2_T2, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z25299_PEA_2_T2 is shown in bold; this coding portion starts at
position 124 and ends at position 591. The transcript also has the following SNPs as listed in
Table 1048 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z25299_PEA_2_P3 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 1048 - Nucleic acid SNPs 



SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
122                                   C->T                       No
                                      C->T                       No
183                                   T->                        No
250                                   T->C                       No
267                                   A->C                       No
267                                   A->G                       No
339                                   C->T                       Yes
371                                   G->A                       No
373                                   A->T                       No
435                                   C->T                       No



Variant protein Z25299_PEA_2_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z25299_PEA_2_T6. An alignment is given to the known protein (Antileukoproteinase 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z25299_PEA_2_P7 and ALK1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z25299_PEA_2_P7, comprising a first
amino acid sequence being at least 90% homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNP corresponding to amino acids 1-81 of ALK1_HUMAN,
which also corresponds to amino acids 1-81 of Z25299_PEA_2_P7, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
RGSLGSAQ corresponding to amino acids 82-89 of Z25299_PEA_2_P7, wherein said first
and second amino acid sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of Z25299_PEA_2_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence RGSLGSAQ in Z25299_PEA_2_P7.
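The head-plus-tail structure described in this comparison report can be illustrated by assembling the two segments and checking that their coordinates agree with the stated amino acid ranges. The sketch below uses the sequences given in the text for Z25299_PEA_2_P7. The `percent_identity` helper is a deliberately simple, ungapped position-by-position comparison; it is only a stand-in, since this section does not define how "homologous" is computed, and the candidate tail in the example is hypothetical.

```python
# Head segment: amino acids 1-81, shared with ALK1_HUMAN (copied from the text).
HEAD = (
    "MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP"
    "GKKRCCPDTCGIKCLDPVDTPNP"
)
# Tail segment: amino acids 82-89, specific to the variant (copied from the text).
TAIL = "RGSLGSAQ"


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: a stand-in for the homology measure, which is not defined here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# The two segments are contiguous and in sequential order.
variant = HEAD + TAIL

# 1-based inclusive coordinates of the tail within the assembled variant.
tail_start = len(HEAD) + 1
tail_end = len(variant)

# Hypothetical candidate tail differing from the stated tail at one position.
candidate_identity = percent_identity(TAIL, "RGSLGSAK")
```

The assembled sequence is 89 residues long with the tail at positions 82 to 89, as stated. The hypothetical candidate with one substitution in eight residues scores 87.5% under this simple measure.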

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z25299_PEA_2_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1049 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z25299_PEA_2_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1049 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
20                                       P->                         No
43                                       C->R                        No
48                                       K->N                        No
82                                       R->S                        No



Variant protein Z25299_PEA_2_P7 is encoded by the following transcript(s):
Z25299_PEA_2_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z25299_PEA_2_T6 is shown in bold; this coding portion starts at
position 124 and ends at position 390. The transcript also has the following SNPs as listed in
Table 1050 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z25299_PEA_2_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1050 - Nucleic acid SNPs 



SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
122                                   C->T                       No
123                                   C->T                       No
576                                   A->C                       Yes
594                                   A->C                       Yes
597                                   A->C                       Yes
630                                   A->C                       Yes
676                                   G->A                       Yes
183                                   T->                        No
250                                   T->C                       No
267                                   A->C                       No
267                                   A->G                       No
339                                   C->T                       Yes
369                                   A->T                       No
431                                   C->T                       No
541                                   C->T                       Yes

Variant protein Z25299_PEA_2_P10 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
Z25299_PEA_2_T9. An alignment is given to the known protein (Antileukoproteinase 1
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between Z25299_PEA_2_P10 and ALK1_HUMAN:

1. An isolated chimeric polypeptide encoding for Z25299_PEA_2_P10, comprising a first
amino acid sequence being at least 90% homologous to

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCP
GKKRCCPDTCGIKCLDPVDTPNPT corresponding to amino acids 1-82 of ALK1_HUMAN,
which also corresponds to amino acids 1-82 of Z25299_PEA_2_P10.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein Z25299_PEA_2_P10 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1051 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein Z25299_PEA_2_P10
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1051 - Amino acid mutations 



SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
20                                        P ->                         No
43                                        C -> R                       No
48                                        K -> N                       No



Variant protein Z25299_PEA_2_P10 is encoded by the following transcript(s):
Z25299_PEA_2_T9, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript Z25299_PEA_2_T9 is shown in bold; this coding portion starts at
position 124 and ends at position 369. The transcript also has the following SNPs as listed in
Table 1052 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein Z25299_PEA_2_P10 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1052 - Nucleic acid SNPs 



SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
122                                    C -> T                      No
123                                    C -> T                      No
451                                    A -> C                      Yes
484                                    A -> C                      Yes
530                                    G -> A                      Yes
183                                    T ->                        No
250                                    T -> C                      No
267                                    A -> C                      No
267                                    A -> G                      No
339                                    C -> T                      Yes
395                                    C -> T                      Yes
430                                    A -> C                      Yes
448                                    A -> C                      Yes
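
By way of a non-limiting illustration only, the relationship between a SNP position on the nucleotide sequence and the corresponding position on the amino acid sequence may be sketched as follows. The sketch assumes 1-based positions and a coding portion running from position 124 to position 369, as stated for transcript Z25299_PEA_2_T9; the function name snp_to_residue is hypothetical and is not part of the present disclosure.

```python
def snp_to_residue(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to (residue number, position within codon).

    Returns None when the position lies outside the coding portion.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


# Coding portion of transcript Z25299_PEA_2_T9 as stated in the text.
CDS_START, CDS_END = 124, 369

for pos in (122, 183, 250, 267, 339, 395):
    print(pos, snp_to_residue(pos, CDS_START, CDS_END))
```

Under these assumptions, nucleotide positions 183, 250 and 267 fall in codons 20, 43 and 48, which agrees with the amino acid positions listed in Table 1051, while positions 122 and 395 fall outside the coding portion.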


As noted above, cluster Z25299 features 11 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.



Segment cluster Z25299_PEA_2_node_20 according to the present invention is supported
by 6 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1. Table 1053 below describes the
starting and ending position of this segment on each transcript.

Table 1053 - Segment location on transcripts 
Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     518                          1099



Segment cluster Z25299_PEA_2_node_21 according to the present invention is supported
by 162 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T6 and
Z25299_PEA_2_T9. Table 1054 below describes the starting and ending position of this
segment on each transcript.

Table 1054 - Segment location on transcripts 



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     1100                         1292
Z25299_PEA_2_T6     514                          706
Z25299_PEA_2_T9     368                          560

Segment cluster Z25299_PEA_2_node_23 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T2. Table 1055 below describes the
starting and ending position of this segment on each transcript.

Table 1055 - Segment location on transcripts



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T2     518                          707



Segment cluster Z25299_PEA_2_node_24 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T2 and Z25299_PEA_2_T3. Table
1056 below describes the starting and ending position of this segment on each transcript.

Table 1056 - Segment location on transcripts



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T2     708                          886
Z25299_PEA_2_T3     518                          696



Segment cluster Z25299_PEA_2_node_8 according to the present invention is supported
by 218 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3, Z25299_PEA_2_T6 and Z25299_PEA_2_T9. Table 1057 below describes
the starting and ending position of this segment on each transcript.



Table 1057 - Segment location on transcripts 



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1                                  208
Z25299_PEA_2_T2                                  208
Z25299_PEA_2_T3                                  208
Z25299_PEA_2_T6                                  208
Z25299_PEA_2_T9                                  208

According to an optional embodiment of the present invention, short segments related to 
the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 

Segment cluster Z25299_PEA_2_node_12 according to the present invention is supported
by 228 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3, Z25299_PEA_2_T6 and Z25299_PEA_2_T9. Table 1058 below describes
the starting and ending position of this segment on each transcript.

Table 1058 - Segment location on transcripts
Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     209                          245
Z25299_PEA_2_T2     209                          245
Z25299_PEA_2_T3     209                          245
Z25299_PEA_2_T6     209                          245
Z25299_PEA_2_T9     209                          245



Segment cluster Z25299_PEA_2_node_13 according to the present invention is supported
by 246 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3, Z25299_PEA_2_T6 and Z25299_PEA_2_T9. Table 1059 below describes
the starting and ending position of this segment on each transcript.



Table 1059 - Segment location on transcripts 



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     246                          357
Z25299_PEA_2_T2     246                          357
Z25299_PEA_2_T3     246                          357
Z25299_PEA_2_T6     246                          357
Z25299_PEA_2_T9     246                          357



Segment cluster Z25299_PEA_2_node_14 according to the present invention can be
found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3, Z25299_PEA_2_T6 and Z25299_PEA_2_T9. Table 1060 below describes
the starting and ending position of this segment on each transcript.

Table 1060 - Segment location on transcripts 



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     358                          367
Z25299_PEA_2_T2     358                          367
Z25299_PEA_2_T3     358                          367
Z25299_PEA_2_T6     358                          367
Z25299_PEA_2_T9     358                          367



Segment cluster Z25299_PEA_2_node_17 according to the present invention can be
found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2 and
Z25299_PEA_2_T3. Table 1061 below describes the starting and ending position of this
segment on each transcript.

Table 1061 - Segment location on transcripts 



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     368                          371
Z25299_PEA_2_T2     368                          371
Z25299_PEA_2_T3     368                          371



Segment cluster Z25299_PEA_2_node_18 according to the present invention is supported
by 221 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3 and Z25299_PEA_2_T6. Table 1062 below describes the starting and
ending position of this segment on each transcript.

Table 1062 - Segment location on transcripts



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     372                          427
Z25299_PEA_2_T2     372                          427
Z25299_PEA_2_T3     372                          427
Z25299_PEA_2_T6     368                          423

Segment cluster Z25299_PEA_2_node_19 according to the present invention is supported
by 197 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): Z25299_PEA_2_T1, Z25299_PEA_2_T2,
Z25299_PEA_2_T3 and Z25299_PEA_2_T6. Table 1063 below describes the starting and
ending position of this segment on each transcript.

Table 1063 - Segment location on transcripts



Transcript name     Segment starting position    Segment ending position
Z25299_PEA_2_T1     428                          517
Z25299_PEA_2_T2     428                          517
Z25299_PEA_2_T3     428                          517
Z25299_PEA_2_T6     424                          513
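
As a non-limiting illustration, the segment locations given in the tables above can be checked for contiguity along a transcript, each segment beginning one position after the preceding segment ends. The sketch below uses the positions reported for transcript Z25299_PEA_2_T1 in Tables 1053, 1054 and 1058 to 1063; node_8 is omitted because only its ending position is used here. The helper name is_contiguous is hypothetical.

```python
def is_contiguous(segments):
    """Return True if each (start, end) segment begins right after the previous one ends."""
    ordered = sorted(segments)
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(ordered, ordered[1:]))


# (start, end) positions on transcript Z25299_PEA_2_T1, copied from the tables above.
t1_segments = {
    "node_12": (209, 245),
    "node_13": (246, 357),
    "node_14": (358, 367),
    "node_17": (368, 371),
    "node_18": (372, 427),
    "node_19": (428, 517),
    "node_20": (518, 1099),
    "node_21": (1100, 1292),
}

print(is_contiguous(t1_segments.values()))
```

The same check may be applied to the other transcripts; for example, on Z25299_PEA_2_T9 node_14 (358 to 367) is followed directly by node_21 (368 to 560).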






Variant protein alignment to the previously known protein:

Sequence name: /tmp/oXgeQ4MeyL/K6VqblMQu2:ALK1_HUMAN

Sequence documentation:

Alignment of: Z25299_PEA_2_P2 x ALK1_HUMAN

Alignment segment 1/1:

Quality: 1371.00
Escore: 0
Matching length: 131
Total length: 131
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50

 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLN 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLN 100

101 PPNFCEMDGQCKRDLKCCMGMCGKSCVSPVK 131
    |||||||||||||||||||||||||||||||
101 PPNFCEMDGQCKRDLKCCMGMCGKSCVSPVK 131
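
For illustration only, a percent identity figure for a gapless alignment can be computed by comparing the two aligned sequences position by position. The sketch below is an assumption about how such a figure may be obtained and is not the alignment program used to produce the report above; the function name percent_identity is hypothetical. The example sequence is residues 1 to 50 of the alignment.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, gapless aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("gapless comparison requires sequences of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Residues 1-50 as shown for both the variant and ALK1_HUMAN in the alignment.
variant = "MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE"
known = "MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE"

print(percent_identity(variant, known))
```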



Sequence name: /tmp/rbf314VLIm/yR43i4SbP4:ALK1_HUMAN

Sequence documentation:

Alignment of: Z25299_PEA_2_P3 x ALK1_HUMAN

Alignment segment 1/1:

Quality: 1371.00
Escore: 0
Matching length: 131
Total length: 131
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50

 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLN 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPTRRKPGKCPVTYGQCLMLN 100

101 PPNFCEMDGQCKRDLKCCMGMCGKSCVSPVK 131
    |||||||||||||||||||||||||||||||
101 PPNFCEMDGQCKRDLKCCMGMCGKSCVSPVK 131

Sequence name: /tmp/KCtSXACZXe/rK4T6LKeRX:ALK1_HUMAN

Sequence documentation:

Alignment of: Z25299_PEA_2_P7 x ALK1_HUMAN

Alignment segment 1/1:

Quality: 835.00
Escore: 0
Matching length: 81
Total length: 81
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50

 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNP 81
    |||||||||||||||||||||||||||||||
 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNP 81

Sequence name: /tmp/LcBlcAxB6c/NSI9pqfxoU:ALK1_HUMAN

Sequence documentation:

Alignment of: Z25299_PEA_2_P10 x ALK1_HUMAN

Alignment segment 1/1:

Quality: 844.00
Escore: 0
Matching length: 82
Total length: 82
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPE 50

 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPT 82
    ||||||||||||||||||||||||||||||||
 51 CQSDWQCPGKKRCCPDTCGIKCLDPVDTPNPT 82

Expression of Secretory leukocyte protease inhibitor Acid-stable proteinase inhibitor Z25299
transcripts, which are detectable by amplicon as depicted in sequence name Z25299 junc13-14-21
in normal and cancerous lung tissues

Expression of Secretory leukocyte protease inhibitor Acid-stable proteinase inhibitor
transcripts detectable by or according to junc13-14-21, Z25299 junc13-14-21 amplicon (SEQ ID
NO: 1666) and Z25299 junc13-14-21F (SEQ ID NO: 1664) and Z25299 junc13-14-21R (SEQ
ID NO: 1665) primers was measured by real time PCR. In parallel the expression of four
housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon,
SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon,
SEQ ID NO:1297), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2 "Tissue sample in testing panel", above), to obtain a value of fold differential
expression for each sample relative to median of the normal PM samples.
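
The normalization described above may be sketched, as a non-limiting example, as follows. The sample names and quantities are hypothetical and are chosen only to show the arithmetic; the function names normalize and fold_vs_normal are likewise assumptions made for this illustration.

```python
from statistics import geometric_mean, median


def normalize(target_qty, housekeeping_qtys):
    """Divide the amplicon quantity by the geometric mean of the housekeeping gene quantities."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_vs_normal(normalized, normal_ids):
    """Divide each normalized quantity by the median of the normal samples."""
    reference = median(normalized[s] for s in normal_ids)
    return {sample: qty / reference for sample, qty in normalized.items()}


# Hypothetical raw quantities: (amplicon, [PBGD, HPRT1, Ubiquitin, SDHA]).
raw = {
    "normal_1": (4.0, [1.0, 1.0, 1.0, 1.0]),
    "normal_2": (6.0, [1.0, 2.0, 1.0, 2.0]),
    "normal_3": (5.0, [1.0, 1.0, 1.0, 1.0]),
    "tumor_1": (0.5, [1.0, 1.0, 1.0, 1.0]),
}

normalized = {s: normalize(qty, hk) for s, (qty, hk) in raw.items()}
fold = fold_vs_normal(normalized, ["normal_1", "normal_2", "normal_3"])
for sample, value in fold.items():
    print(sample, round(value, 3))
```

A value below 1 for a cancer sample in this sketch corresponds to lower expression than the median of the normal samples.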

Figure 36 is a histogram showing down regulation of the above-indicated Secretory
leukocyte protease inhibitor Acid-stable proteinase inhibitor transcripts in cancerous lung
samples relative to the normal samples.

As is evident from Figure 36, the expression of Secretory leukocyte protease inhibitor
Acid-stable proteinase inhibitor transcripts detectable by the above amplicon(s) in cancer
samples was significantly lower than in the non-cancerous samples (Sample Nos. 47-50, 90-93,
96-99, Table 2, "Tissue sample in testing panel").

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor transcripts detectable by the above amplicon(s) in lung
cancer samples versus the normal tissue samples was determined by T test as 1.98E-04. This
value demonstrates statistical significance of the results.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z25299 junc13-14-21F forward
primer; and Z25299 junc13-14-21R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Z25299
junc13-14-21.

Forward primer (SEQ ID NO: 1664): ACCCCAAACCCAACTTGATTC
Reverse primer (SEQ ID NO: 1665): TCAGTGGTGGAGCCAAGTCTC
Amplicon (SEQ ID NO: 1666):
ACCCCAAACCCAACTTGATTCCTGCCATATGGAGGAGGCTCTGGAGTCCTGCTCTGT
GTGGTCCAGGTCCTTTCCACCCTGAGACTTGGCTCCACCACTGA
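
As a non-limiting illustration, the relationship between a primer pair and its amplicon can be checked computationally: the amplicon is expected to begin with the forward primer and to end with the reverse complement of the reverse primer. The sketch below applies this check to the sequences of SEQ ID NOs: 1664, 1665 and 1666 as given above; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


FORWARD = "ACCCCAAACCCAACTTGATTC"   # SEQ ID NO: 1664
REVERSE = "TCAGTGGTGGAGCCAAGTCTC"   # SEQ ID NO: 1665
AMPLICON = (                        # SEQ ID NO: 1666
    "ACCCCAAACCCAACTTGATTCCTGCCATATGGAGGAGGCTCTGGAGTCCTGCTCTGT"
    "GTGGTCCAGGTCCTTTCCACCCTGAGACTTGGCTCCACCACTGA"
)

print(primers_flank(AMPLICON, FORWARD, REVERSE))
```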

Z25299 transcripts, which are detectable by amplicon as depicted in sequence name Z25299
seg20 in normal and cancerous lung tissues

Expression of Secretory leukocyte protease inhibitor Acid-stable proteinase inhibitor
transcripts detectable by or according to seg20, Z25299 seg20 amplicon (SEQ ID NO: 1669) and
Z25299 seg20F (SEQ ID NO: 1667) and Z25299 seg20R (SEQ ID NO: 1668) primers was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO:331) was measured similarly. For each RT sample, the expression of the above amplicon
was normalized to the geometric mean of the quantities of the housekeeping genes. The
normalized quantity of each RT sample was then divided by the median of the quantities of the
normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples
in testing panel", above). Then the reciprocal of this ratio was calculated, to obtain a value of
fold down-regulation for each sample relative to median of the normal PM samples.

Figure 37 is a histogram showing down regulation of the above-indicated Secretory
leukocyte protease inhibitor Acid-stable proteinase inhibitor transcripts in cancerous lung
samples relative to the normal samples. The number and percentage of samples that exhibit at
least 5 fold down regulation, out of the total number of samples tested, is indicated at the bottom.

As is evident from Figure 37, the expression of Secretory leukocyte protease inhibitor
Acid-stable proteinase inhibitor transcripts detectable by the above amplicon(s) in cancer
samples was significantly lower than in the non-cancerous samples (Sample Nos. 47-50, 90-93,
96-99, Table 2, "Tissue sample in testing panel"). Notably, down regulation of at least 5 fold
was found in 6 out of 15 adenocarcinoma samples, 9 out of 16 squamous cell carcinoma
samples, 3 out of 4 large cell carcinoma samples and in 8 out of 8 small cell carcinoma samples.
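
By way of a non-limiting example, the fold down-regulation value and the count of samples meeting a given threshold may be computed as sketched below. The relative expression values are hypothetical and serve only to show the arithmetic; the function names are assumptions made for this illustration.

```python
def fold_down_regulation(relative_expression):
    """Reciprocal of the expression ratio relative to the median of the normal samples."""
    return 1.0 / relative_expression


def count_at_least(folds, threshold):
    """Number of samples whose fold down-regulation meets or exceeds the threshold."""
    return sum(f >= threshold for f in folds)


# Hypothetical expression ratios (sample / median of normal samples).
ratios = [0.5, 0.25, 0.125, 0.0625]
folds = [fold_down_regulation(r) for r in ratios]

hits = count_at_least(folds, 5)
print(f"{hits} out of {len(folds)} samples show at least 5 fold down regulation")
```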

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Secretory leukocyte protease
inhibitor Acid-stable proteinase inhibitor transcripts detectable by the above amplicon(s) in
lung cancer samples versus the normal tissue samples was determined by T test as 9.43E-02 in
adenocarcinoma, 5.62E-02 in squamous cell carcinoma, 3.38E-01 in large cell carcinoma and
3.78E-02 in small cell carcinoma.

A threshold of 5 fold down regulation was found to differentiate between cancer and
normal samples with a P value of 3.73E-02 in adenocarcinoma, 1.10E-02 in squamous cell
carcinoma, 2.64E-02 in large cell carcinoma and 7.14E-05 in small cell carcinoma, as checked by
Fisher's exact test. The above values demonstrate statistical significance of the results.
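
For illustration only, a two-sided Fisher's exact test on a 2x2 table (samples at or above the threshold versus below it, in cancer versus normal samples) may be sketched as follows. The counts in the example are hypothetical, since the number of normal samples meeting the threshold is not stated above, and the sketch is not asserted to reproduce the P values reported in the text. The function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating point ties.
    return sum(p for p in (prob(x) for x in range(low, high + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical counts: 9 of 16 cancer samples and 1 of 12 normal samples
# meet the 5 fold down regulation threshold.
p_value = fisher_exact_two_sided(9, 7, 1, 11)
print(f"P = {p_value:.3g}")
```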
Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z25299 seg20F forward primer; and
Z25299 seg20R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Z25299 seg20.

Forward primer (SEQ ID NO: 1667): CTCCTGAACCCTACTCCAAGCA
Reverse primer (SEQ ID NO: 1668): CAGGCGATCCTATGGAAATCC
Amplicon (SEQ ID NO: 1669):
CTCCTGAACCCTACTCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCTTCAAGAGAA
CTGTTCTCCAGGTCTCAGGGCCAGGATTTCCATAGGATCGCCTG

Expression of Homo sapiens secretory leukocyte protease inhibitor (antileukoproteinase) (SLPI)
Z25299 transcripts which are detectable by amplicon as depicted in sequence name Z25299
seg23 in normal and cancerous lung tissues

Expression of Homo sapiens secretory leukocyte protease inhibitor (antileukoproteinase)
(SLPI) transcripts detectable by or according to seg23, Z25299 seg23 amplicon (SEQ ID NO:
1672) and primers Z25299 seg23F (SEQ ID NO: 1670) and Z25299 seg23R (SEQ ID NO: 1671)
was measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297),
Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID
NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ
ID NO:331), was measured similarly. For each RT sample, the expression of the above
amplicon was normalized to the geometric mean of the quantities of the housekeeping genes.
The normalized quantity of each RT sample was then divided by the median of the quantities of
the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above). Then
the reciprocal of this ratio was calculated, to obtain a value of fold down-regulation for each
sample relative to median of the normal PM samples.

Figure 68 is a histogram showing down regulation of the above-indicated Homo sapiens
secretory leukocyte protease inhibitor (antileukoproteinase) (SLPI) transcripts in cancerous lung
samples relative to the normal samples.

As is evident from Figure 68, the expression of Homo sapiens secretory leukocyte
protease inhibitor (antileukoproteinase) (SLPI) transcripts detectable by the above amplicon(s)
in cancer samples was significantly lower than in the non-cancerous samples (Sample Nos.
46-50, 90-93, 96-99, Table 2). Notably, down regulation of at least 10 fold was found in 7 out of 15
adenocarcinoma samples, 9 out of 16 squamous cell carcinoma samples, 3 out of 4 large cell
carcinoma samples and in 8 out of 8 small cell carcinoma samples.



Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: Z25299 seg23F forward primer; and
Z25299 seg23R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Z25299 seg23.

Primers:

Forward primer Z25299 seg23F (SEQ ID NO: 1670): CAAGCAATTGAGGGACCAGG
Reverse primer Z25299 seg23R (SEQ ID NO: 1671):
CAAAAAACATTGTTAATGAGAGAGATGAC
Amplicon Z25299 seg23 (SEQ ID NO: 1672):
CAAGCAATTGAGGGACCAGGAAGTGGATCCTCTAGAGATGAGGAGGCATTCTGCTG
GATGACTTTTAAAAATGTTTTCTCCAGAGTCATCTCTCTCATTAACAATGTTTTTTG



Expression of Secretory leukocyte protease inhibitor Acid-stable proteinase inhibitor Z25299
transcripts which are detectable by amplicon as depicted in sequence name Z25299seg20 in
different normal tissues

Expression of Secretory leukocyte protease inhibitor transcripts detectable by or according to
Z25299seg20 amplicon (SEQ ID NO: 1669) and primers Z25299seg20F (SEQ ID NO: 1667) and
Z25299seg20R (SEQ ID NO: 1668) was measured by real time PCR. In parallel the expression
of four housekeeping genes - RPL19 (GenBank Accession No. NM_000981; RPL19 amplicon,
SEQ ID NO: 1630), TATA box (GenBank Accession No. NM_003194; TATA amplicon, SEQ
ID NO: 1633), Ubiquitin (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon,
SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon -
SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample, the expression of the
above amplicon was normalized to the geometric mean of the quantities of the housekeeping
genes. The normalized quantity of each RT sample was then divided by the median of the
quantities of the ovary samples (Sample Nos. 18-20, Table 3), to obtain a value of relative
expression of each sample relative to median of the ovary samples.



Primers:

Forward primer (SEQ ID NO: 1667): CTCCTGAACCCTACTCCAAGCA
Reverse primer (SEQ ID NO: 1668): CAGGCGATCCTATGGAAATCC
Amplicon (SEQ ID NO: 1669):
CTCCTGAACCCTACTCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCTTCAAGAGAA
CTGTTCTCCAGGTCTCAGGGCCAGGATTTCCATAGGATCGCCTG



The results are demonstrated in Figure 69, showing the expression of Secretory 
leukocyte protease inhibitor Acid-stable proteinase inhibitor Z25299 transcripts which are 
detectable by amplicon as depicted in sequence name Z25299seg20 in different normal tissues. 




Expression of Secretory leukocyte protease inhibitor Z25299 transcripts which are detectable by 
amplicon as depicted in sequence name Z25299seg23 in different normal tissues 
Expression of Secretory leukocyte protease inhibitor transcripts detectable by or
according to Z25299seg23 amplicon (SEQ ID NO: 1672) and primers Z25299seg23F (SEQ ID
NO: 1670) and Z25299seg23R (SEQ ID NO: 1671) was measured by real time PCR. In parallel the
expression of four housekeeping genes - RPL19 (GenBank Accession No. NM_000981; RPL19
amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No. NM_003194; TATA
amplicon, SEQ ID NO: 1633), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the ovary samples (Sample Nos. 18-20, Table 3), to obtain a value of
relative expression of each sample relative to median of the ovary samples.



Primers:

Forward primer Z25299 seg23F (SEQ ID NO: 1670): CAAGCAATTGAGGGACCAGG
Reverse primer Z25299 seg23R (SEQ ID NO: 1671):
CAAAAAACATTGTTAATGAGAGAGATGAC
Amplicon Z25299 seg23 (SEQ ID NO: 1672):
CAAGCAATTGAGGGACCAGGAAGTGGATCCTCTAGAGATGAGGAGGCATTCTGCTG
GATGACTTTTAAAAATGTTTTCTCCAGAGTCATCTCTCTCATTAACAATGTTTTTTG

The results are demonstrated in Figure 70, showing the expression of Secretory 
leukocyte protease inhibitor Acid-stable proteinase inhibitor Z25299 transcripts which are 
detectable by amplicon as depicted in sequence name Z25299seg23 in different normal tissues. 



DESCRIPTION FOR CLUSTER HSSTROL3

Cluster HSSTROL3 features 6 transcript(s) and 16 segment(s) of interest, the names for
which are given in Tables 1064 and 1065, respectively, the sequences themselves are given at
the end of the application. The selected protein variants are given in Table 1066.

Table 1064 - Transcripts of interest



Transcript Name    Sequence ID No.
HSSTROL3_T5        125
HSSTROL3_T8        126
HSSTROL3_T9        127
HSSTROL3_T10       128
HSSTROL3_T11       129
HSSTROL3_T12       130



Table 1065 - Segments of interest 


Segment Name        Sequence ID No.
HSSTROL3_node_6     887
HSSTROL3_node_10    888
HSSTROL3_node_13    889
HSSTROL3_node_15    890
HSSTROL3_node_19    891
HSSTROL3_node_21    892
HSSTROL3_node_24    893
HSSTROL3_node_25    894
HSSTROL3_node_26    895
HSSTROL3_node_28    896
HSSTROL3_node_29    897
HSSTROL3_node_11    898
HSSTROL3_node_17    899
HSSTROL3_node_18    900
HSSTROL3_node_20    901
HSSTROL3_node_27    902



Table 1066 - Proteins of interest 



Protein Name    Sequence ID No.    Corresponding Transcript(s)
HSSTROL3_P4     1394               HSSTROL3_T5
HSSTROL3_P5     1395               HSSTROL3_T8; HSSTROL3_T9
HSSTROL3_P7     1396               HSSTROL3_T10
HSSTROL3_P8     1397               HSSTROL3_T11
HSSTROL3_P9     1398               HSSTROL3_T12



These sequences are variants of the known protein Stromelysin-3 precursor (SwissProt
accession identifier MM11_HUMAN; known also according to the synonyms EC 3.4.24.-;
Matrix metalloproteinase-11; MMP-11; ST3; SL-3), SEQ ID NO: 1455, referred to herein as the
previously known protein.

Protein Stromelysin-3 precursor is known or believed to have the following function(s):
May play an important role in the progression of epithelial malignancies. The sequence for
protein Stromelysin-3 precursor is given at the end of the application, as "Stromelysin-3
precursor amino acid sequence".

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: proteolysis and peptidolysis; developmental processes;
morphogenesis, which are annotation(s) related to Biological Process; stromelysin 3; calcium
binding; zinc binding; hydrolase, which are annotation(s) related to Molecular Function; and
extracellular matrix, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

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Cluster HSSTROL3 can be used as a diagnostic marker according to overexpression of 
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given 
according to the previously described methods. The term "number" in the left hand column of 
the table and the numbers on the y-axis of figure 38 refer to weighted expression of ESTs in 
5 each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to 
the expression of all ESTs in that category, according to parts per million). 
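
The "parts per million" normalization described above can be illustrated with a minimal Python sketch. The function name and the EST counts below are hypothetical and chosen only for illustration; the weighting applied to the ESTs before this ratio is taken is described elsewhere in the application and is not reproduced here.

```python
def expression_ppm(cluster_est_count: float, category_est_count: float) -> float:
    """Express a cluster's EST count in a category as parts per million.

    cluster_est_count: (weighted) number of ESTs of the cluster in the category.
    category_est_count: (weighted) number of all ESTs in that category.
    """
    if category_est_count <= 0:
        raise ValueError("category must contain at least one EST")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical counts: 5 cluster ESTs among 250,000 ESTs of one tissue category.
print(expression_ppm(5, 250_000))  # 20.0
```

A value of 0 in the normal-tissue table would, under this reading, simply mean that no EST of the cluster was observed in that category.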

Overall, the following results were obtained as shown with regard to the histograms in
Figure 38 and Table 1067. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: transitional cell carcinoma, epithelial malignant tumors, a
mixture of malignant tumors from different tissues and pancreas carcinoma.



Table 1067 - Normal tissue distribution

Name of Tissue   Number
adrenal          0
bladder          0
brain            1
colon            63
epithelial       33
general          13
head and neck    101
kidney           0
lung             11
breast           8
ovary            14
pancreas         0
prostate         2
skin             99
Thyroid          0
uterus           181

Table 1068 - P values and ratios for expression in cancerous tissue

Name of Tissue   P1        P2        SP1       R3        SP2       R4
adrenal          1         4.6e-01   1         1.0       5.3e-01   1.9
bladder          2.7e-01   3.4e-01   3.3e-03   4.9       2.1e-02   3.3
brain            3.5e-01   2.6e-01   1         1.7       3.3e-01   2.8
colon            7.7e-02   1.5e-01   3.1e-01   1.4       5.2e-01   1.0
epithelial       1.2e-04   1.2e-02   1.3e-06   2.7       4.6e-02   1.4
general          5.4e-09   3.1e-05   1.8e-16   5.0       3.1e-07   2.6
head and neck    4.6e-01   4.3e-01   1         0.6       9.4e-01   0.7
kidney           2.5e-01   3.5e-01   1.1e-01   4.0       2.4e-01   2.8
lung             1.8e-01   4.5e-01   1.9e-01   2.7       5.1e-01   1.4
breast           2.0e-01   3.4e-01   7.3e-02   3.3       2.5e-01   2.0
ovary            2.6e-01   3.2e-01   2.2e-02   2.0       7.0e-02   1.6
pancreas         9.5e-02   1.8e-01   1.8e-04   7.8       1.6e-03   5.5
prostate         8.2e-01   7.8e-01   4.5e-01   1.8       5.6e-01   1.5
skin             5.2e-01   5.8e-01   7.1e-01   0.8       1         0.3
Thyroid          2.9e-01   2.9e-01   1         1.1       1         1.1
uterus           4.2e-01   8.0e-01   7.5e-01   0.6       9.9e-01   0.4



As noted above, cluster HSSTROL3 features 6 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein Stromelysin-3
precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein HSSTROL3_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSSTROL3_T5.
An alignment is given to the known protein (Stromelysin-3 precursor) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between HSSTROL3_P4 and MM11_HUMAN:

1. An isolated chimeric polypeptide encoding for HSSTROL3_P4, comprising a first
amino acid sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1 - 163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P4, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P4,
a second amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRFPVHAALVWGPE
KNKIYFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADG corresponding
to amino acids 165 - 445 of MM11_HUMAN, which also corresponds to amino acids 165 - 445
of HSSTROL3_P4, and a third amino acid sequence being at least 70%, optionally at least 80%,
preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence

ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG
corresponding to amino acids 446 - 496 of HSSTROL3_P4, wherein said first amino acid
sequence, bridging amino acid, second amino acid sequence and third amino acid sequence are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HSSTROL3_P4, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG in
HSSTROL3_P4.
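
The graded homology thresholds recited for the tail (70%, 80%, 85%, 90% and 95%) can be illustrated with a short Python sketch. It assumes, purely for illustration, that homology is scored as ungapped percent identity between two sequences of equal length; the application defines how homology is determined elsewhere, typically with alignment software, so this is a simplification and not the method of the invention. The function names and the candidate sequence are hypothetical.

```python
# Tail of HSSTROL3_P4 (amino acids 446 - 496), as recited above.
TAIL_P4 = "ALGVRQLVGGGHSSRFSHLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG"

# Thresholds recited in the text, in percent.
THRESHOLDS = (70, 80, 85, 90, 95)


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def thresholds_met(candidate: str, reference: str = TAIL_P4) -> list:
    """Return the recited thresholds that the candidate reaches."""
    identity = percent_identity(candidate, reference)
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical candidate: the tail with its first three residues substituted.
candidate = "GGA" + TAIL_P4[3:]
print(round(percent_identity(candidate, TAIL_P4), 1))  # 94.1
print(thresholds_met(candidate))                       # [70, 80, 85, 90]
```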



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
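
The reasoning stated above (all signal-peptide predictors agree, and no trans-membrane predictor fires) can be sketched as a small decision rule in Python. The function name and the "undetermined" outcome are assumptions made for this illustration; the text only states the conditions under which the localization is believed to be secreted.

```python
def predict_localization(signal_peptide_calls, transmembrane_calls) -> str:
    """Combine boolean predictor outputs into a localization call.

    signal_peptide_calls: one bool per signal-peptide prediction program.
    transmembrane_calls: one bool per trans-membrane region prediction program.
    """
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one predictor of each kind is required")
    has_signal_peptide = all(signal_peptide_calls)
    has_transmembrane_region = any(transmembrane_calls)
    if has_signal_peptide and not has_transmembrane_region:
        return "secreted"
    # Other combinations are not covered by the passage above.
    return "undetermined"


# Two signal-peptide programs agree; neither trans-membrane program fires.
print(predict_localization([True, True], [False, False]))  # secreted
```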

Variant protein HSSTROL3_P4 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1069, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSSTROL3_P4
sequence provides support for the deduced sequence of this variant protein according to the
present invention).



Table 1069 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
38                                       V -> A                      Yes
104                                      R -> P                      Yes
214                                      A ->                        No
323                                      Q -> H                      Yes



Variant protein HSSTROL3_P4 is encoded by the following transcript(s):
HSSTROL3_T5, for which the sequence(s) is/are given at the end of the application. The coding
portion of transcript HSSTROL3_T5 is shown in bold; this coding portion starts at position 24
and ends at position 1511. The transcript also has the following SNPs as listed in Table 1070
(given according to their position on the nucleotide sequence, with the alternative nucleic acid
listed; the last column indicates whether the SNP is known or not; the presence of known SNPs
in variant protein HSSTROL3_P4 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 1070 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
334                                   G -> C                     Yes
663                                   G ->                       No
699                                   -> T                       No
992                                   G -> C                     Yes
1528                                  A -> G                     Yes
1710                                  A -> G                     Yes
2251                                  A -> G                     Yes
2392                                  C ->                       No
2444                                  C -> A                     Yes
2470                                  A -> T                     Yes
2687                                  -> G                       No
2696                                  -> G                       No
2710                                  C ->                       No
2729                                  -> A                       No
2755                                  T -> C                     No
2813                                  A ->                       No
2813                                  A -> C                     No
2963                                  A ->                       No
2963                                  A -> C                     No
2993                                  T -> C                     Yes
3140                                  -> T                       No
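
The coding-portion coordinates and the SNP positions given for HSSTROL3_T5 can be cross-checked with a few lines of Python. The sketch assumes 1-based, inclusive nucleotide positions and that the stated coding portion excludes the stop codon; both are assumptions, although they are consistent with the 496-residue length recited for HSSTROL3_P4. The function names are hypothetical.

```python
def cds_codon_count(cds_start: int, cds_end: int) -> int:
    """Number of codons in a coding portion given 1-based inclusive bounds."""
    length = cds_end - cds_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coding portion length must be a positive multiple of 3")
    return length // 3


def codon_index(nt_position: int, cds_start: int, cds_end: int):
    """Map a nucleotide position to its 1-based codon (amino acid) index.

    Returns None when the position lies outside the coding portion.
    """
    if not cds_start <= nt_position <= cds_end:
        return None
    return (nt_position - cds_start) // 3 + 1


# Coding portion of HSSTROL3_T5 as stated in the text: positions 24 to 1511.
print(cds_codon_count(24, 1511))                                  # 496
print([codon_index(p, 24, 1511) for p in (136, 334, 663, 992)])   # [38, 104, 214, 323]
print(codon_index(1528, 24, 1511))                                # None
```

Under these assumptions, nucleotide positions 136, 334, 663 and 992 fall in codons 38, 104, 214 and 323, which are the amino acid positions listed for this variant, while positions beyond 1511 lie outside the coding portion.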



Variant protein HSSTROL3_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSSTROL3_T8
and HSSTROL3_T9. An alignment is given to the known protein (Stromelysin-3 precursor) at
the end of the application. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between HSSTROL3_P5 and MM11_HUMAN:

1. An isolated chimeric polypeptide encoding for HSSTROL3_P5, comprising a first
amino acid sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1 - 163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P5, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P5,
a second amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQ corresponding to amino acids 165 - 358 of MM11_HUMAN,
which also corresponds to amino acids 165 - 358 of HSSTROL3_P5, and a third amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
ELGFPSSTGRDESLEHCRCQGLHK corresponding to amino acids 359 - 382 of
HSSTROL3_P5, wherein said first amino acid sequence, bridging amino acid, second amino
acid sequence and third amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HSSTROL3_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence ELGFPSSTGRDESLEHCRCQGLHK in HSSTROL3_P5.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSSTROL3_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1071, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSSTROL3_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).



Table 1071 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
38                                       V -> A                      Yes
104                                      R -> P                      Yes
214                                      A ->                        No
323                                      Q -> H                      Yes



Variant protein HSSTROL3_P5 is encoded by the following transcript(s):
HSSTROL3_T8 and HSSTROL3_T9, for which the sequence(s) is/are given at the end of the
application.

The coding portion of transcript HSSTROL3_T8 is shown in bold; this coding portion
starts at position 24 and ends at position 1169. The transcript also has the following SNPs as
listed in Table 1072 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein HSSTROL3_P5 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 1072 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
334                                   G -> C                     Yes
663                                   G ->                       No
699                                   -> T                       No
992                                   G -> C                     Yes
1903                                  C ->                       No
1955                                  C -> A                     Yes
1981                                  A -> T                     Yes
2198                                  -> G                       No
2207                                  -> G                       No
2221                                  C ->                       No
2240                                  -> A                       No
2266                                  T -> C                     No
2324                                  A ->                       No
2324                                  A -> C                     No
2474                                  A ->                       No
2474                                  A -> C                     No
2504                                  T -> C                     Yes
2651                                  -> T                       No



The coding portion of transcript HSSTROL3_T9 is shown in bold; this coding portion
starts at position 24 and ends at position 1169. The transcript also has the following SNPs as
listed in Table 1073 (given according to their position on the nucleotide sequence, with the
alternative nucleic acid listed; the last column indicates whether the SNP is known or not; the
presence of known SNPs in variant protein HSSTROL3_P5 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 1073 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
334                                   G -> C                     Yes
663                                   G ->                       No
699                                   -> T                       No
992                                   G -> C                     Yes
1666                                  A -> G                     Yes
1848                                  A -> G                     Yes
2389                                  A -> G                     Yes
2530                                  C ->                       No
2582                                  C -> A                     Yes
2608                                  A -> T                     Yes
2825                                  -> G                       No
2834                                  -> G                       No
2848                                  C ->                       No
2867                                  -> A                       No
2893                                  T -> C                     No
2951                                  A ->                       No
2951                                  A -> C                     No
3101                                  A ->                       No
3101                                  A -> C                     No
3131                                  T -> C                     Yes
3278                                  -> T                       No



Variant protein HSSTROL3_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSSTROL3_T10.
An alignment is given to the known protein (Stromelysin-3 precursor) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between HSSTROL3_P7 and MM11_HUMAN:

1. An isolated chimeric polypeptide encoding for HSSTROL3_P7, comprising a first
amino acid sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1 - 163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P7, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P7,
a second amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQG corresponding to amino acids 165 - 359 of MM11_HUMAN,
which also corresponds to amino acids 165 - 359 of HSSTROL3_P7, and a third amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
TTGVSTPAPGV corresponding to amino acids 360 - 370 of HSSTROL3_P7, wherein said first
amino acid sequence, bridging amino acid, second amino acid sequence and third amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HSSTROL3_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence TTGVSTPAPGV in HSSTROL3_P7.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSSTROL3_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1074, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSSTROL3_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1074 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
38                                       V -> A                      Yes
104                                      R -> P                      Yes
214                                      A ->                        No
323                                      Q -> H                      Yes



Variant protein HSSTROL3_P7 is encoded by the following transcript(s):
HSSTROL3_T10, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HSSTROL3_T10 is shown in bold; this coding portion starts at
position 24 and ends at position 1133. The transcript also has the following SNPs as listed in
Table 1075 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HSSTROL3_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1075 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
334                                   G -> C                     Yes
663                                   G ->                       No
699                                   -> T                       No
992                                   G -> C                     Yes
1386                                  A -> G                     Yes
1568                                  A -> G                     Yes
2109                                  A -> G                     Yes
2250                                  C ->                       No
2302                                  C -> A                     Yes
2328                                  A -> T                     Yes
2545                                  -> G                       No
2554                                  -> G                       No
2568                                  C ->                       No
2587                                  -> A                       No
2613                                  T -> C                     No
2671                                  A ->                       No
2671                                  A -> C                     No
2821                                  A ->                       No
2821                                  A -> C                     No
2851                                  T -> C                     Yes
2998                                  -> T                       No



Variant protein HSSTROL3_P8 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSSTROL3_T11.
An alignment is given to the known protein (Stromelysin-3 precursor) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between HSSTROL3_P8 and MM11_HUMAN:

1. An isolated chimeric polypeptide encoding for HSSTROL3_P8, comprising a first
amino acid sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP
WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW corresponding to
amino acids 1 - 163 of MM11_HUMAN, which also corresponds to amino acids 1 - 163 of
HSSTROL3_P8, a bridging amino acid H corresponding to amino acid 164 of HSSTROL3_P8,
a second amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLE corresponding to amino acids 165 - 286 of MM11_HUMAN, which also corresponds
to amino acids 165 - 286 of HSSTROL3_P8, and a third amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
VRPCLPVPLLLCWPL corresponding to amino acids 287 - 301 of HSSTROL3_P8, wherein
said first amino acid sequence, bridging amino acid, second amino acid sequence and third
amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HSSTROL3_P8, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence VRPCLPVPLLLCWPL in HSSTROL3_P8.
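
The "contiguous and in a sequential order" arrangement recited for HSSTROL3_P8 can be checked by concatenating the recited parts and recording where each one lands. The Python sketch below is illustrative only: the helper name is hypothetical, and the sequences are copied from the comparison report above (with each part taken at 100 % identity).

```python
# Parts recited for HSSTROL3_P8, in order.
FIRST = (
    "MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS"
    "PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFP"
    "WQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW"
)
BRIDGE = "H"
SECOND = (
    "GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG"
    "LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN"
    "EIAPLE"
)
TAIL = "VRPCLPVPLLLCWPL"


def assemble(parts):
    """Concatenate (name, sequence) parts; return the sequence and 1-based spans."""
    sequence = ""
    spans = []
    for name, part in parts:
        start = len(sequence) + 1
        sequence += part
        spans.append((name, start, len(sequence)))
    return sequence, spans


protein, spans = assemble(
    [("first", FIRST), ("bridge", BRIDGE), ("second", SECOND), ("tail", TAIL)]
)
for name, start, end in spans:
    print(f"{name}: {start} - {end}")
print(len(protein))
```

The resulting spans (1 - 163, 164, 165 - 286 and 287 - 301) reproduce the amino acid ranges recited in the comparison report.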

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSSTROL3_P8 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1076, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSSTROL3_P8
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1076 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
38                                       V -> A                      Yes
104                                      R -> P                      Yes
214                                      A ->                        No

Variant protein HSSTROL3_P8 is encoded by the following transcript(s):
HSSTROL3_T11, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HSSTROL3_T11 is shown in bold; this coding portion starts at
position 24 and ends at position 926. The transcript also has the following SNPs as listed in
Table 1077 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HSSTROL3_P8 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1077 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
334                                   G -> C                     Yes
663                                   G ->                       No
699                                   -> T                       No
935                                   G -> A                     Yes
948                                   G -> A                     Yes
1084                                  G -> C                     Yes
1557                                  C ->                       No
1609                                  C -> A                     Yes
1635                                  A -> T                     Yes
1852                                  -> G                       No
1861                                  -> G                       No
1875                                  C ->                       No
1894                                  -> A                       No
1920                                  T -> C                     No
1978                                  A ->                       No
1978                                  A -> C                     No
2128                                  A ->                       No
2128                                  A -> C                     No
2158                                  T -> C                     Yes
2305                                  -> T                       No



Variant protein HSSTROL3_P9 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSSTROL3_T12.
An alignment is given to the known protein (Stromelysin-3 precursor) at the end of the
application. One or more alignments to one or more previously published protein sequences are
given at the end of the application. A brief description of the relationship of the variant protein
according to the present invention to each such aligned protein is as follows:

Comparison report between HSSTROL3_P9 and MM11_HUMAN:

1. An isolated chimeric polypeptide encoding for HSSTROL3_P9, comprising a first
amino acid sequence being at least 90 % homologous to

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS
PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQK corresponding to amino acids 1 - 96
of MM11_HUMAN, which also corresponds to amino acids 1 - 96 of HSSTROL3_P9, a
second amino acid sequence being at least 90 % homologous to

RILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW
corresponding to amino acids 113 - 163 of MM11_HUMAN, which also corresponds to amino
acids 97 - 147 of HSSTROL3_P9, a bridging amino acid H corresponding to amino acid 148 of
HSSTROL3_P9, a third amino acid sequence being at least 90 % homologous to

GDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLG
LQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTN
EIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGL
PSPVDAAFEDAQGHIWFFQG corresponding to amino acids 165 - 359 of MM11_HUMAN,
which also corresponds to amino acids 149 - 343 of HSSTROL3_P9, and a fourth amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
TTGVSTPAPGV corresponding to amino acids 344 - 354 of HSSTROL3_P9, wherein said first
amino acid sequence, second amino acid sequence, bridging amino acid, third amino acid
sequence and fourth amino acid sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of HSSTROL3_P9,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise KR, having a
structure as follows: a sequence starting from any of amino acid numbers 96-x to 96; and ending
at any of amino acid numbers 97 + ((n-2) - x), in which x varies from 0 to n-2.

3. An isolated polypeptide encoding for a tail of HSSTROL3_P9, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence TTGVSTPAPGV in HSSTROL3_P9.
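
The edge-portion definition for HSSTROL3_P9 describes every window of length n that spans the junction between amino acids 96 and 97 (the residues K and R). A minimal Python sketch of that enumeration follows. It reads the end position as 97 + ((n-2) - x), which is an interpretation of the formula in the text, and the function name is hypothetical; only the two segments flanking the junction are needed, and they are copied from the comparison report.

```python
# Amino acids 1 - 96 and 97 - 147 of HSSTROL3_P9, as recited above.
SEGMENT_1 = (
    "MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSS"
    "PAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQK"
)
SEGMENT_2 = "RILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYW"
JUNCTION_REGION = SEGMENT_1 + SEGMENT_2


def edge_windows(sequence: str, junction: int, n: int):
    """List (start, end, peptide) windows of length n spanning the junction.

    junction is the 1-based position of the last residue before the junction
    (96 here); x runs from 0 to n-2 as in the text.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to span the junction")
    windows = []
    for x in range(n - 1):
        start = junction - x
        end = junction + 1 + (n - 2 - x)
        if start < 1 or end > len(sequence):
            continue  # window would run off the available sequence
        windows.append((start, end, sequence[start - 1:end]))
    return windows


windows = edge_windows(JUNCTION_REGION, junction=96, n=10)
for start, end, peptide in windows:
    print(start, end, peptide)
```

For n = 10 this yields nine windows, from positions 96 - 105 down to 88 - 97, each of which contains the KR pair at the junction.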

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSSTROL3_P9 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1078, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSSTROL3_P9
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1078 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
38                                       V -> A                      Yes
198                                      A ->                        No
307                                      Q -> H                      Yes

Variant protein HSSTROL3_P9 is encoded by the following transcript(s):
HSSTROL3_T12, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HSSTROL3_T12 is shown in bold; this coding portion starts at
position 24 and ends at position 1085. The transcript also has the following SNPs as listed in
Table 1079 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HSSTROL3_P9 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1079 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
136                                   T -> C                     Yes
615                                   G ->                       No
651                                   -> T                       No
944                                   G -> C                     Yes
1275                                  C ->                       No
1327                                  C -> A                     Yes
1353                                  A -> T                     Yes
1570                                  -> G                       No
1579                                  -> G                       No
1593                                  C ->                       No
1612                                  -> A                       No
1638                                  T -> C                     No
1696                                  A ->                       No
1696                                  A -> C                     No
1846                                  A ->                       No
1846                                  A -> C                     No
1876                                  T -> C                     Yes
2023                                  -> T                       No

As noted above, cluster HSSTROL3 features 16 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HSSTROL3_node_6 according to the present invention is supported by
14 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1080 below describes the
starting and ending position of this segment on each transcript.

Table 1080 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
HSSTROL3_T5       1                           131
HSSTROL3_T8       1                           131
HSSTROL3_T9       1                           131
HSSTROL3_T10      1                           131
HSSTROL3_T11      1                           131
HSSTROL3_T12      1                           131



Segment cluster HSSTROL3_node_10 according to the present invention is supported by
21 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1081 below describes the
starting and ending position of this segment on each transcript.

Table 1081 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
HSSTROL3_T5       132                         313
HSSTROL3_T8       132                         313
HSSTROL3_T9       132                         313
HSSTROL3_T10      132                         313
HSSTROL3_T11      132                         313
HSSTROL3_T12      132                         313



Segment cluster HSSTROL3_node_13 according to the present invention is supported by
36 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1082 below describes the
starting and ending position of this segment on each transcript.

Table 1082 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
HSSTROL3_T5       362                         505
HSSTROL3_T8       362                         505
HSSTROL3_T9       362                         505
HSSTROL3_T10      362                         505
HSSTROL3_T11      362                         505
HSSTROL3_T12      314                         457
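
The segment coordinates in Tables 1080 to 1082 can be compared across transcripts with a short Python sketch. The dictionary layout and function names are hypothetical; the coordinates are copied from the tables, assuming 1-based inclusive positions. Only HSSTROL3_T5 and HSSTROL3_T12 are shown, since the other transcripts share the HSSTROL3_T5 coordinates for these segments.

```python
# (start, end) of each segment on two transcripts, from Tables 1080 - 1082.
SEGMENTS = {
    "HSSTROL3_node_6": {"HSSTROL3_T5": (1, 131), "HSSTROL3_T12": (1, 131)},
    "HSSTROL3_node_10": {"HSSTROL3_T5": (132, 313), "HSSTROL3_T12": (132, 313)},
    "HSSTROL3_node_13": {"HSSTROL3_T5": (362, 505), "HSSTROL3_T12": (314, 457)},
}


def segment_length(span) -> int:
    """Length of a segment given 1-based inclusive (start, end)."""
    start, end = span
    return end - start + 1


def start_offset(node: str, transcript_a: str, transcript_b: str) -> int:
    """Difference in starting position of one segment between two transcripts."""
    return SEGMENTS[node][transcript_a][0] - SEGMENTS[node][transcript_b][0]


for node, locations in SEGMENTS.items():
    lengths = {segment_length(span) for span in locations.values()}
    shift = start_offset(node, "HSSTROL3_T5", "HSSTROL3_T12")
    print(node, sorted(lengths), shift)
```

The segment lengths agree between the two transcripts, while HSSTROL3_node_13 starts 48 nucleotides earlier on HSSTROL3_T12. That shift would correspond to 16 codons, the same number of residues (amino acids 97 - 112 of MM11_HUMAN) that the comparison report for HSSTROL3_P9 skips; this reading is an inference from the tables rather than a statement made in the text.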



Segment cluster HSSTROL3_node_15 according to the present invention is supported by
47 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1083 below describes the
starting and ending position of this segment on each transcript.

Table 1083 - Segment location on transcripts 



Transcript name . ' 


; Segment" -. f : ' > 
starting position ^ ■ ; 


Segment 

ending position . 


HSSTROL3_T5 


506 


639 


HSSTROL3_T8 


506 


639 


HSSTROL3_T9 


506 


639 


HSSTROL3_T10 


506 


639 


HSSTROL3_Tll 


506 


639 


HSSTROL3_T12 


458 


591 



Segment cluster HSSTROL3_node_19 according to the present invention is supported by
63 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1084 below describes the
starting and ending position of this segment on each transcript.

Table 1084 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        699                          881
HSSTROL3_T8        699                          881
HSSTROL3_T9        699                          881
HSSTROL3_T10       699                          881
HSSTROL3_T11       699                          881
HSSTROL3_T12       651                          833

Segment cluster HSSTROL3_node_21 according to the present invention is supported by
61 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1085 below describes the
starting and ending position of this segment on each transcript.

Table 1085 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        882                          1098
HSSTROL3_T8        882                          1098
HSSTROL3_T9        882                          1098
HSSTROL3_T10       882                          1098
HSSTROL3_T11       974                          1190
HSSTROL3_T12       834                          1050



Segment cluster HSSTROL3_node_24 according to the present invention is supported by
7 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T8 and HSSTROL3_T9. Table 1086 below
describes the starting and ending position of this segment on each transcript.

Table 1086 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T8        1099                         1236
HSSTROL3_T9        1099                         1236



Segment cluster HSSTROL3_node_25 according to the present invention is supported by
13 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T8. Table 1087 below describes the
starting and ending position of this segment on each transcript.

Table 1087 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T8        1237                         1536



Segment cluster HSSTROL3_node_26 according to the present invention is supported by
55 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9 and
HSSTROL3_T11. Table 1088 below describes the starting and ending position of this segment
on each transcript.

Table 1088 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        1099                         1240
HSSTROL3_T8        1537                         1678
HSSTROL3_T9        1237                         1378
HSSTROL3_T11       1191                         1332



Segment cluster HSSTROL3_node_28 according to the present invention is supported by
10 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T9 and HSSTROL3_T10.
Table 1089 below describes the starting and ending position of this segment on each transcript.

Table 1089 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        1357                         2283
HSSTROL3_T9        1495                         2421
HSSTROL3_T10       1215                         2141











Segment cluster HSSTROL3_node_29 according to the present invention is supported by
109 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1090 below describes the
starting and ending position of this segment on each transcript.

Table 1090 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        2284                         3194
HSSTROL3_T8        1795                         2705
HSSTROL3_T9        2422                         3332
HSSTROL3_T10       2142                         3052
HSSTROL3_T11       1449                         2359
HSSTROL3_T12       1167                         2077



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster HSSTROL3_node_11 according to the present invention is supported by
25 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10 and HSSTROL3_T11. Table 1091 below describes the starting and ending
position of this segment on each transcript.

Table 1091 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        314                          361
HSSTROL3_T8        314                          361
HSSTROL3_T9        314                          361
HSSTROL3_T10       314                          361
HSSTROL3_T11       314                          361



Segment cluster HSSTROL3_node_17 according to the present invention is supported by
45 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1092 below describes the
starting and ending position of this segment on each transcript.

Table 1092 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        640                          680
HSSTROL3_T8        640                          680
HSSTROL3_T9        640                          680
HSSTROL3_T10       640                          680
HSSTROL3_T11       640                          680
HSSTROL3_T12       592                          632



Segment cluster HSSTROL3_node_18 according to the present invention can be found in
the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1093 below describes the
starting and ending position of this segment on each transcript.

Table 1093 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        681                          698
HSSTROL3_T8        681                          698
HSSTROL3_T9        681                          698
HSSTROL3_T10       681                          698
HSSTROL3_T11       681                          698
HSSTROL3_T12       633                          650



Segment cluster HSSTROL3_node_20 according to the present invention is supported by
1 library. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T11. Table 1094 below describes the
starting and ending position of this segment on each transcript.

Table 1094 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T11       882                          973



Segment cluster HSSTROL3_node_27 according to the present invention is supported by
50 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSSTROL3_T5, HSSTROL3_T8, HSSTROL3_T9,
HSSTROL3_T10, HSSTROL3_T11 and HSSTROL3_T12. Table 1095 below describes the
starting and ending position of this segment on each transcript.

Table 1095 - Segment location on transcripts

Transcript name    Segment starting position    Segment ending position
HSSTROL3_T5        1241                         1356
HSSTROL3_T8        1679                         1794
HSSTROL3_T9        1379                         1494
HSSTROL3_T10       1099                         1214
HSSTROL3_T11       1333                         1448
HSSTROL3_T12       1051                         1166
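
As an illustrative check of the short-segment tables (Tables 1091 to 1095), the following minimal sketch computes each segment's length from its starting and ending positions and compares it with the stated limit of about 120 bp. It assumes that positions are 1-based and inclusive, which the tables do not state explicitly; the names `SHORT_SEGMENTS` and `segment_length` are hypothetical and are not part of the application.

```python
# Start/end positions copied from Tables 1091-1095.
# node_20 is listed only on HSSTROL3_T11; the other rows are taken from HSSTROL3_T5.
SHORT_SEGMENTS = {
    "HSSTROL3_node_11": (314, 361),
    "HSSTROL3_node_17": (640, 680),
    "HSSTROL3_node_18": (681, 698),
    "HSSTROL3_node_20": (882, 973),
    "HSSTROL3_node_27": (1241, 1356),
}


def segment_length(start, end):
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


# Map each segment name to its computed length.
lengths = {name: segment_length(s, e) for name, (s, e) in SHORT_SEGMENTS.items()}

for name, bp in lengths.items():
    print(name, bp, "bp", "(within 120 bp)" if bp <= 120 else "(longer than 120 bp)")
```

Under the inclusive-coordinate assumption, every short segment listed above falls within the 120 bp limit.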



Variant protein alignment to the previously known protein:

Sequence name: MM11_HUMAN

Sequence documentation:

Alignment of: HSSTROL3_P4 x MM11_HUMAN

Alignment segment 1/1:

Quality: 4444.00
Escore: 0
Matching length: 445
Total length: 445
Matching Percent Similarity: 99.78
Matching Percent Identity: 99.78
Total Percent Similarity: 99.78
Total Percent Identity: 99.78
Gaps: 0

Alignment:

  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50

 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100

101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150

151 GRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200
    ||||||||||||| ||||||||||||||||||||||||||||||||||||
151 GRADIMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200

201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250

251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300

301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350

351 QGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRFPVHAALVWGPEKNKI 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 QGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRFPVHAALVWGPEKNKI 400

401 YFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADG 445
    |||||||||||||||||||||||||||||||||||||||||||||
401 YFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADG 445



Sequence name: MM11_HUMAN

Sequence documentation:

Alignment of: HSSTROL3_P5 x MM11_HUMAN

Alignment segment 1/1:

Quality: 3566.00
Escore: 0
Matching length: 358
Total length: 358
Matching Percent Similarity: 99.72
Matching Percent Identity: 99.72
Total Percent Similarity: 99.72
Total Percent Identity: 99.72
Gaps: 0

Alignment:

  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50

 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100

101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150

151 GRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200
    ||||||||||||| ||||||||||||||||||||||||||||||||||||
151 GRADIMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200

201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250

251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300

301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350

351 QGHIWFFQ 358
    ||||||||
351 QGHIWFFQ 358



Sequence name: MM11_HUMAN

Sequence documentation:

Alignment of: HSSTROL3_P7 x MM11_HUMAN

Alignment segment 1/1:

Quality: 3575.00
Escore: 0
Matching length: 359
Total length: 359
Matching Percent Similarity: 99.72
Matching Percent Identity: 99.72
Total Percent Similarity: 99.72
Total Percent Identity: 99.72
Gaps: 0

Alignment:

  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50

 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100

101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150

151 GRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200
    ||||||||||||| ||||||||||||||||||||||||||||||||||||
151 GRADIMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200

201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250

251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300

301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350

351 QGHIWFFQG 359
    |||||||||
351 QGHIWFFQG 359



Sequence name: MM11_HUMAN

Sequence documentation:

Alignment of: HSSTROL3_P8 x MM11_HUMAN

Alignment segment 1/1:

Quality: 2838.00
Escore: 0
Matching length: 286
Total length: 286
Matching Percent Similarity: 99.65
Matching Percent Identity: 99.65
Total Percent Similarity: 99.65
Total Percent Identity: 99.65
Gaps: 0

Alignment:

  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50

 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100

101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150

151 GRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200
    ||||||||||||| ||||||||||||||||||||||||||||||||||||
151 GRADIMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200

201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250

251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLE 286
    ||||||||||||||||||||||||||||||||||||
251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLE 286

Sequence name: MM11_HUMAN

Sequence documentation:

Alignment of: HSSTROL3_P9 x MM11_HUMAN

Alignment segment 1/1:

Quality: 3316.00
Escore: 0
Matching length: 343
Total length: 359
Matching Percent Similarity: 99.71
Matching Percent Identity: 99.71
Total Percent Similarity: 95.26
Total Percent Identity: 95.26
Gaps: 1

Alignment:

  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQP 50

 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQK.... 96
    ||||||||||||||||||||||||||||||||||||||||||||||
 51 WHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVL 100

 97 ............RILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 134
                ||||||||||||||||||||||||||||||||||||||
101 SGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHE 150

135 GRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 184
    ||||||||||||| ||||||||||||||||||||||||||||||||||||
151 GRADIMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWT 200

185 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 234
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 IGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDC 250

235 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 284
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 RGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDA 300

285 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 334
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 VSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDA 350

335 QGHIWFFQG 343
    |||||||||
351 QGHIWFFQG 359
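
The percentage figures in the alignment reports above can be sanity-checked with a short calculation. The sketch below is an illustration only: the identical-residue counts are not stated in the reports and are inferred here as the matching length minus the single H/D mismatch visible in each alignment (position 164 of MM11_HUMAN). The helper name `percent` and the dictionary layout are hypothetical.

```python
def percent(matches, length):
    """Percentage of identical positions, rounded to two decimals."""
    return round(100.0 * matches / length, 2)


# (inferred identical residues, matching length, total length) per variant.
# Matching and total lengths are copied from the reports; the identical-residue
# counts are an inference (matching length minus one mismatch).
ALIGNMENTS = {
    "HSSTROL3_P4": (444, 445, 445),
    "HSSTROL3_P5": (357, 358, 358),
    "HSSTROL3_P7": (358, 359, 359),
    "HSSTROL3_P8": (285, 286, 286),
    "HSSTROL3_P9": (342, 343, 359),
}

for name, (identical, matching_len, total_len) in ALIGNMENTS.items():
    print(
        name,
        "matching percent identity:", percent(identical, matching_len),
        "total percent identity:", percent(identical, total_len),
    )
```

For HSSTROL3_P9 the total length (359) exceeds the matching length (343) by the 16 gap positions, which is why its total percent identity is lower than its matching percent identity.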



Expression of Stromelysin-3 precursor HSSTROL3 transcripts which are detectable by
amplicon as depicted in sequence name HSSTROL3 seg24 in normal and cancerous lung
tissues

Expression of Stromelysin-3 precursor (EC 3.4.24.-) (Matrix metalloproteinase-11)
(MMP-11) (ST3) (SL-3) transcripts detectable by or according to seg24, HSSTROL3 seg24
amplicon (SEQ ID NO: 1675) and HSSTROL3 seg24F (SEQ ID NO: 1673) and HSSTROL3
seg24R (SEQ ID NO: 1674) primers was measured by real time PCR. In parallel the expression
of four housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon -
PBGD-amplicon, SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon -
HPRT1-amplicon, SEQ ID NO: 1297), Ubiquitin (GenBank Accession No. BC000449; amplicon
- Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample,
the expression of the above amplicon was normalized to the geometric mean of the quantities of
the housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2 "Tissue samples in testing panel", above), to obtain a value of fold up-regulation
for each sample relative to median of the normal PM samples.
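
The normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the experiments: the function names, sample labels and all quantities are hypothetical, and only the two steps named in the text are shown (division by the geometric mean of the housekeeping-gene quantities, then division by the median of the normal PM samples).

```python
from statistics import geometric_mean, median


def normalize(target_qty, housekeeping_qtys):
    """Divide the amplicon quantity by the geometric mean of the housekeeping quantities."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_up_regulation(normalized, normal_ids):
    """Divide every normalized quantity by the median of the normal samples."""
    reference = median(normalized[s] for s in normal_ids)
    return {sample: value / reference for sample, value in normalized.items()}


# Hypothetical quantities: (amplicon, [PBGD, HPRT1, Ubiquitin, SDHA]).
raw = {
    "normal_1": (2.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_2": (4.0, [1.0, 4.0, 2.0, 2.0]),
    "normal_3": (6.0, [1.0, 4.0, 2.0, 2.0]),
    "tumor_1": (40.0, [1.0, 4.0, 2.0, 2.0]),
}

normalized = {s: normalize(t, hk) for s, (t, hk) in raw.items()}
folds = fold_up_regulation(normalized, ["normal_1", "normal_2", "normal_3"])

for sample, fold in folds.items():
    print(sample, round(fold, 2))
```

Using the median of the normal samples as the reference makes the fold values less sensitive to a single outlying normal sample than a mean would be.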

Figure 39 is a histogram showing over expression of the above-indicated Stromelysin-3
precursor transcripts in cancerous lung samples relative to the normal samples. Values
represent the average of duplicate experiments. Error bars indicate the minimal and maximal
values obtained.

As is evident from Figure 39, the expression of Stromelysin-3 precursor transcripts
detectable by the above amplicon(s) in cancer samples was significantly higher than in the
non-cancerous samples (Sample Nos. 47-50, 90-93, 96-99 Table 2, "Tissue samples in testing
panel"). Notably an over-expression of at least 5 fold was found in 13 out of 15
adenocarcinoma samples, 8 out of 16 squamous cell carcinoma samples, 3 out of 4 large cell
carcinoma samples and in 7 out of 8 small cell carcinoma samples.

A threshold of 5 fold over-expression was found to differentiate between cancer and
normal samples with a P value of 4.04E-04 in adenocarcinoma, 9.89E-02 in squamous cell
carcinoma, 6.04E-02 in large cell carcinoma and 3.14E-03 in small cell carcinoma, as checked by
Fisher's exact test. The above values demonstrate statistical significance of the results.
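
For readers unfamiliar with the test named above, the sketch below shows one way a Fisher exact test on a 2x2 table of "at or above threshold" versus "below threshold" counts could be computed. It is an illustration only: the text does not say whether a one-sided or two-sided test was used (a one-sided test is assumed here), the function name is hypothetical, and the number of normal samples exceeding the threshold is not given in the text, so the value used below is invented and the result is not expected to reproduce the reported P values.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing at least `a` in the
    top-left cell, given fixed row and column totals.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(total, col1)


# Adenocarcinoma: 13 of 15 samples at or above the 5 fold threshold (from the text).
# Normal samples: 12 sample numbers are listed; 2 of 12 above threshold is a
# hypothetical count used only to make the example runnable.
p_value = fisher_exact_greater(13, 2, 2, 10)
print(f"one-sided p-value (hypothetical normal counts): {p_value:.3g}")
```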

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: HSSTROL3 seg24F forward primer;
and HSSTROL3 seg24R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: HSSTROL3
seg24.

Forward Primer (SEQ ID NO: 1673): ATTTCCATCCTCAACTGGCAGA
Reverse Primer (SEQ ID NO: 1674): TGCCCTGGAACCCACG
Amplicon (SEQ ID NO: 1675):
ATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGAC
TTCACAAATGAAGGCACAGCATGGGAAACCTGCGTGGGTTCCAGGGCA
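
The relationship between the primer pair and the amplicon listed above can be checked with a few lines of code. The sketch below is an illustration only: it tests that the amplicon begins with the forward primer and ends with the reverse complement of the reverse primer. The function names are hypothetical; the sequences are copied from the listing above.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon starts with the forward primer and ends with the
    reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


FORWARD = "ATTTCCATCCTCAACTGGCAGA"  # SEQ ID NO: 1673
REVERSE = "TGCCCTGGAACCCACG"        # SEQ ID NO: 1674
AMPLICON = (                        # SEQ ID NO: 1675
    "ATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGAC"
    "TTCACAAATGAAGGCACAGCATGGGAAACCTGCGTGGGTTCCAGGGCA"
)

print("primers flank amplicon:", primers_flank(AMPLICON, FORWARD, REVERSE))
```

The same check can be applied to any other primer pair and amplicon by substituting the three sequences.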

Expression of Stromelysin-3 precursor HSSTROL3 transcripts which are detectable by
amplicon as depicted in sequence name HSSTROL3 seg24 in different normal tissues

Expression of Stromelysin-3 precursor transcripts detectable by or according to
HSSTROL3 seg24 amplicon (SEQ ID NO: 1675) and HSSTROL3 seg24F (SEQ ID NO: 1673)
and HSSTROL3 seg24R (SEQ ID NO: 1674) was measured by real time PCR. In parallel the
expression of four housekeeping genes - Ubiquitin (GenBank Accession No. BC000449; amplicon
- Ubiquitin-amplicon, SEQ ID NO:328), SDHA (GenBank Accession No. NM_004168;
amplicon - SDHA-amplicon, SEQ ID NO:331), RPL19 (GenBank Accession No. NM_000981;
RPL19 amplicon, SEQ ID NO: 1630) and TATA box (GenBank Accession No. NM_003194; TATA
amplicon, SEQ ID NO: 1633) was measured similarly. For each RT sample, the expression of
the above amplicon was normalized to the geometric mean of the quantities of the housekeeping
genes. The normalized quantity of each RT sample was then divided by the median of the
quantities of the lung samples (Sample Nos. 15-17, Table 2 "Tissue samples in normal panel",
above), to obtain a value of relative expression of each sample relative to median of the lung
samples.

Forward Primer (SEQ ID NO: 1673): ATTTCCATCCTCAACTGGCAGA 

Reverse Primer (SEQ ID NO: 1674): TGCCCTGGAACCCACG 

Amplicon (SEQ ID NO: 1675): 
ATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGAC 

TTCACAAATGAAGGCACAGCATGGGAAACCTGCGTGGGTTCCAGGGCA 

The results are demonstrated in Figure 40, showing the expression of Stromelysin-3 
HSSTROL3 transcripts, which are detectable by amplicon as depicted in sequence name 
HSSTROL3 seg24, in different normal tissues. 

Expression of Homo sapiens matrix metalloproteinase 11 (stromelysin 3) (MMP11) HSSTROL3
transcripts which are detectable by amplicon as depicted in sequence name HSSTROL3 seg20-21
in normal and cancerous lung tissues

Expression of Homo sapiens matrix metalloproteinase 11 (stromelysin 3) (MMP11)
transcripts detectable by or according to seg20-21, HSSTROL3 seg20-21 amplicon (SEQ ID
NO: 1678) and primers HSSTROL3 seg20-21F (SEQ ID NO: 1676) and HSSTROL3 seg20-21R
(SEQ ID NO: 1677) was measured by real time PCR. In parallel the expression of four
housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon,
SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon,
SEQ ID NO: 1297), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon -
SDHA-amplicon, SEQ ID NO:331), was measured similarly. For each RT sample, the
expression of the above amplicon was normalized to the geometric mean of the quantities of the
housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2, above), to obtain a value of fold up-regulation for each sample relative to median
of the normal PM samples.

Figure 71 is a histogram showing over expression of the above-indicated Homo sapiens
matrix metalloproteinase 11 (stromelysin 3) (MMP11) transcripts in cancerous lung samples
relative to the normal samples.

As is evident from Figure 71, the expression of Homo sapiens matrix metalloproteinase
11 (stromelysin 3) (MMP11) transcripts detectable by the above amplicon(s) in cancer samples
was significantly higher than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99,
Table 2). Notably an over-expression of at least 6 fold was found in 11 out of 15
adenocarcinoma samples, 6 out of 16 squamous cell carcinoma samples, 1 out of 4 large cell
carcinoma samples and in 6 out of 8 small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: HSSTROL3 seg20-21F forward
primer; and HSSTROL3 seg20-21R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: HSSTROL3
seg20-21.

Primers:

Forward primer HSSTROL3 seg20-21F (SEQ ID NO: 1676): TCTGCTGGCCACTGTGACTG
Reverse primer HSSTROL3 seg20-21R (SEQ ID NO: 1677): GAAGAAAAAGAGCTCGCCTCG
Amplicon HSSTROL3 seg20-21 (SEQ ID NO: 1678):
TCTGCTGGCCACTGTGACTGCAGCATATGCCCTCAGCATGTGTCCCTCTCTCCCACC
CCAGCCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGGTCTCCACCA
TCCGAGGCGAGCTCTTTTTCTTC



Expression of Homo sapiens matrix metalloproteinase 11 (stromelysin 3) (MMP11) HSSTROL3
transcripts which are detectable by amplicon as depicted in sequence name HSSTROL3 junc21-27
in normal and cancerous lung tissues

Expression of Homo sapiens matrix metalloproteinase 11 (stromelysin 3) (MMP11)
transcripts detectable by or according to junc21-27, HSSTROL3 junc21-27 amplicon (SEQ ID
NO: 1681) and primers HSSTROL3 junc21-27F (SEQ ID NO: 1679) and HSSTROL3
junc21-27R (SEQ ID NO: 1680) was measured by real time PCR. In parallel the expression of four
housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon,
SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon,
SEQ ID NO: 1297), Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon -
SDHA-amplicon, SEQ ID NO:331), was measured similarly. For each RT sample, the
expression of the above amplicon was normalized to the geometric mean of the quantities of the
housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2, above), to obtain a value of fold up-regulation for each sample relative to median
of the normal PM samples.

Figure 72 is a histogram showing over expression of the above-indicated Homo sapiens
matrix metalloproteinase 11 (stromelysin 3) (MMP11) transcripts in cancerous lung samples
relative to the normal samples.

As is evident from Figure 72, the expression of Homo sapiens matrix metalloproteinase
11 (stromelysin 3) (MMP11) transcripts detectable by the above amplicon(s) in cancer samples
was significantly higher than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99,
Table 2). Notably an over-expression of at least 10 fold was found in 15 out of 15
adenocarcinoma samples, 13 out of 16 squamous cell carcinoma samples, 3 out of 4 large cell
carcinoma samples and in 5 out of 8 small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: HSSTROL3 junc21-27F forward
primer; and HSSTROL3 junc21-27R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: HSSTROL3
junc21-27.

Primers:

Forward primer HSSTROL3 junc21-27F (SEQ ID NO: 1679): ACATTTGGTTCTTCCAAGGGACTAC
Reverse primer HSSTROL3 junc21-27R (SEQ ID NO: 1680): TCGATCTCAGAGGGCACCC
Amplicon HSSTROL3 junc21-27 (SEQ ID NO: 1681):
ACATTTGGTTCTTCCAAGGGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGA
CAGTCCCGTGCCCCGCAGGGCCACTGACTGGAGAGGGGTGCCCTCTGAGATCGA



20 
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DESCRIPTION FOR CLUSTER HUMTREFAC 
Cluster HUMTREFAC features 2 transcript(s) and 7 segment(s) of interest, the names for 
which are given in Tables 1096 and 1097, respectively, the sequences themselves are given at 
the end of the application. The selected protein variants are given in table 1098. 

5 Table 1096 - Transcripts of interest 



Transcript Name :■■- ■>: ./ 


Sequence ID No. ' / i f 


HUMTREFAC_PEA_2_T4 


131 


HUMTREFAC_PEA_2_T5 


132 


Table 1097 - Segments of interest

Segment Name              Sequence ID No.
HUMTREFAC_PEA_2_node_0    903
HUMTREFAC_PEA_2_node_9    904
HUMTREFAC_PEA_2_node_2    905
HUMTREFAC_PEA_2_node_3    906
HUMTREFAC_PEA_2_node_4    907
HUMTREFAC_PEA_2_node_5    908
HUMTREFAC_PEA_2_node_8    909



Table 1098 - Proteins of interest

Protein Name              Sequence ID No.    Corresponding Transcript(s)
HUMTREFAC_PEA_2_P7        1399               HUMTREFAC_PEA_2_T5
HUMTREFAC_PEA_2_P8        1400               HUMTREFAC_PEA_2_T4

These sequences are variants of the known protein Trefoil factor 3 precursor (SwissProt
accession identifier TFF3_HUMAN; known also according to the synonyms Intestinal trefoil
factor; hP1.B), SEQ ID NO: 1456, referred to herein as the previously known protein.

Protein Trefoil factor 3 precursor is known or believed to have the following function(s):
May have a role in promoting cell migration (motogen). The sequence for protein Trefoil factor
3 precursor is given at the end of the application, as "Trefoil factor 3 precursor amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 1099.

Table 1099 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
74-76                                     QEA -> TRKT



Protein Trefoil factor 3 precursor localization is believed to be Secreted.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: defense response; digestion, which are annotation(s) related to
Biological Process; and extracellular, which are annotation(s) related to Cellular Component.
The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.



Cluster HUMTREFAC can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 41 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
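
The "parts per million" figure described above can be sketched as a simple ratio. The following Python fragment is a minimal illustration only: the function name `parts_per_million` and the example counts are hypothetical, and the weighting scheme referred to as "weighted expression" is not specified in this passage, so no weighting is applied here.

```python
def parts_per_million(cluster_est_count, total_est_count):
    """Scale the share of ESTs belonging to one cluster to parts per million.

    Unweighted sketch: any library weighting used in the described methods
    is not reproduced here because it is not given in this passage.
    """
    if total_est_count <= 0:
        raise ValueError("total_est_count must be positive")
    return 1_000_000 * cluster_est_count / total_est_count


# Hypothetical counts, chosen only to show the arithmetic.
example = parts_per_million(8, 200_000)
print(example)  # 40.0
```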

Overall, the following results were obtained as shown with regard to the histograms in
Figure 41 and Table 1100. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: a mixture of malignant tumors from different tissues, breast
malignant tumors, pancreas carcinoma and prostate cancer.



Table 1100 - Normal tissue distribution

Name of Tissue    Number
adrenal           40
colon             797
epithelial        95
general           39
liver             0
lung              57
lymph nodes       3
breast            0
muscle            3
pancreas          2
prostate          16
stomach           0
Thyroid           257
uterus            54



Table 1101 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3      SP2        R4
adrenal           6.4e-01    6.9e-01    7.1e-01    1.1     7.8e-01    0.9
colon             4.6e-01    5.7e-01    9.7e-01    0.5     1          0.4
epithelial        2.4e-02    3.4e-01    9.5e-10    2.0     5.3e-02    1.1
general           2.5e-04    3.9e-02    1.4e-28    3.6     1.9e-10    1.9
liver             1          6.8e-01    1          1.0     6.9e-01    1.4
lung              4.8e-01    7.6e-01    2.2e-03    1.0     1.6e-01    0.5
lymph nodes       5.1e-01    8.0e-01    2.3e-02    5.0     1.9e-01    2.1
breast            7.6e-02    1.2e-01    3.1e-06    12.0    1.1e-03    6.5
muscle            9.2e-01    4.8e-01    1          0.8     3.9e-01    2.1
pancreas          1.2e-01    2.4e-01    5.7e-03    6.5     2.1e-02    4.6
prostate          1.5e-01    2.7e-01    9.9e-10    8.1     3.1e-07    5.7
stomach           3.0e-01    1.3e-01    5.0e-01    2.0     6.7e-02    2.8
Thyroid           6.4e-01    6.4e-01    9.6e-01    0.5     9.6e-01    0.5
uterus            4.1e-01    7.3e-01    7.5e-02    1.3     4.0e-01    0.8



As noted above, cluster HUMTREFAC features 2 transcript(s), which were listed in Table
1 above. These transcript(s) encode for protein(s) which are variant(s) of protein Trefoil factor 3
precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein HUMTREFAC_PEA_2_P7 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMTREFAC_PEA_2_T5. The location of the variant protein was determined according to
results from a number of different software programs and analyses, including analyses from
SignalP and other specialized programs. The variant protein is believed to be located as follows
with regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.
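
The localization reasoning stated above (all signal-peptide predictors agree on a signal peptide, and no trans-membrane predictor reports a trans-membrane region) can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: the function name `is_predicted_secreted` and the boolean inputs are hypothetical and stand in for the outputs of the prediction programs, which are not reproduced here.

```python
def is_predicted_secreted(signal_peptide_calls, transmembrane_calls):
    """Apply the consensus rule described in the text.

    signal_peptide_calls: iterable of booleans, one per signal-peptide predictor.
    transmembrane_calls: iterable of booleans, one per trans-membrane predictor.
    Returns True only if every signal-peptide predictor is positive and no
    trans-membrane predictor is positive.
    """
    signal_peptide_calls = list(signal_peptide_calls)
    transmembrane_calls = list(transmembrane_calls)
    if not signal_peptide_calls:
        return False  # no evidence of a signal peptide at all
    return all(signal_peptide_calls) and not any(transmembrane_calls)


# Two predictors of each kind, matching the "both ... neither" wording above.
print(is_predicted_secreted([True, True], [False, False]))  # True
print(is_predicted_secreted([True, False], [False, False]))  # False
```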

Variant protein HUMTREFAC_PEA_2_P7 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 1102 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMTREFAC_PEA_2_P7 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 1102 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
5                                         A -> S                       No
5                                         A -> T                       No
14                                        A -> V                       Yes
43                                        L -> M                       No
60                                        P -> S                       Yes
123                                       S -> *                       Yes



Variant protein HUMTREFAC_PEA_2_P7 is encoded by the following transcript(s):
HUMTREFAC_PEA_2_T5, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMTREFAC_PEA_2_T5 is shown in bold; this coding
portion starts at position 278 and ends at position 688. The transcript also has the following
SNPs as listed in Table 1103 (given according to their position on the nucleotide sequence, with
the alternative nucleic acid listed; the last column indicates whether the SNP is known or not;
the presence of known SNPs in variant protein HUMTREFAC_PEA_2_P7 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 1103 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
233                                    A -> G                      Yes
290                                    G -> A                      No
290                                    G -> T                      No
318                                    C -> T                      Yes
404                                    C -> A                      No
404                                    C -> T                      No
455                                    C -> T                      Yes
645                                    C -> A                      Yes
685                                    C -> T                      No
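
The nucleotide positions in Table 1103 can be related to the amino acid positions in Table 1102 by simple arithmetic, assuming that the coding portion is read in frame from its first stated position (278) through its last (688). The following Python sketch illustrates that mapping; the function name `codon_number` is hypothetical and the in-frame assumption is ours, not a statement from the application.

```python
def codon_number(nt_pos, cds_start, cds_end):
    """Map a 1-based transcript position to a 1-based codon index.

    Returns None when the position lies outside the stated coding portion.
    Assumes the reading frame starts exactly at cds_start.
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


CDS_START, CDS_END = 278, 688  # coding portion of HUMTREFAC_PEA_2_T5 as stated above

# SNP positions taken from Table 1103.
mapping = {pos: codon_number(pos, CDS_START, CDS_END)
           for pos in (233, 290, 318, 404, 455, 645, 685)}
print(mapping)
```

Under this assumption, positions 290, 318, 404, 455 and 645 fall in codons 5, 14, 43, 60 and 123, which are the amino acid positions listed in Table 1102, while position 233 lies upstream of the coding portion.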



Variant protein HUMTREFAC_PEA_2_P8 according to the present invention has an
amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMTREFAC_PEA_2_T4. An alignment is given to the known protein (Trefoil factor 3
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between HUMTREFAC_PEA_2_P8 and TFF3_HUMAN:
1. An isolated chimeric polypeptide encoding for HUMTREFAC_PEA_2_P8, comprising
a first amino acid sequence being at least 90% homologous to
MAARALCMLGLVLALLSSSSAEEYVGL corresponding to amino acids 1 - 27 of
TFF3_HUMAN, which also corresponds to amino acids 1 - 27 of HUMTREFAC_PEA_2_P8,
and a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence WKVHLPKGEGFSSG corresponding to amino acids 28 - 41
of HUMTREFAC_PEA_2_P8, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMTREFAC_PEA_2_P8, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence WKVHLPKGEGFSSG in HUMTREFAC_PEA_2_P8.
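
The two-part structure described in the comparison report can be illustrated with a short Python sketch that joins the shared head (amino acids 1 - 27) to the unique tail (amino acids 28 - 41) and compares the resulting length with the coding portion of transcript HUMTREFAC_PEA_2_T4, which is stated elsewhere in this description to run from position 278 to position 400. The function `percent_identity` is a hypothetical, ungapped simplification; it is not the homology measure used by the application, which is not defined in this passage.

```python
HEAD = "MAARALCMLGLVLALLSSSSAEEYVGL"  # amino acids 1 - 27, shared with TFF3_HUMAN
TAIL = "WKVHLPKGEGFSSG"               # amino acids 28 - 41, unique tail

# "Contiguous and in a sequential order": head immediately followed by tail.
variant = HEAD + TAIL


def percent_identity(a, b):
    """Percent of identical residues between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Number of complete codons in the stated coding portion (positions 278 to 400).
cds_codons = (400 - 278 + 1) // 3

print(len(variant), cds_codons, percent_identity(HEAD, variant[:27]))
```

The variant length of 41 residues agrees with the 41 codons spanned by the stated coding portion.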

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HUMTREFAC_PEA_2_P8 also has the following non-silent SNPs
(Single Nucleotide Polymorphisms) as listed in Table 1104 (given according to their position(s)
on the amino acid sequence, with the alternative amino acid(s) listed; the last column indicates
whether the SNP is known or not; the presence of known SNPs in variant protein
HUMTREFAC_PEA_2_P8 sequence provides support for the deduced sequence of this variant
protein according to the present invention).

Table 1104 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
5                                         A -> S                       No
5                                         A -> T                       No
14                                        A -> V                       Yes



Variant protein HUMTREFAC_PEA_2_P8 is encoded by the following transcript(s):
HUMTREFAC_PEA_2_T4, for which the sequence(s) is/are given at the end of the application.
The coding portion of transcript HUMTREFAC_PEA_2_T4 is shown in bold; this coding
portion starts at position 278 and ends at position 400. The transcript also has the following
SNPs as listed in Table 1105 (given according to their position on the nucleotide sequence, with
the alternative nucleic acid listed; the last column indicates whether the SNP is known or not;
the presence of known SNPs in variant protein HUMTREFAC_PEA_2_P8 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 1105 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
233                                    A -> G                      Yes
290                                    G -> A                      No
290                                    G -> T                      No
318                                    C -> T                      Yes
515                                    C -> A                      No
515                                    C -> T                      No
566                                    C -> T                      Yes
756                                    C -> A                      Yes
796                                    C -> T                      No
1265                                   A -> C                      No
1266                                   A -> T                      No



As noted above, cluster HUMTREFAC features 7 segment(s), which were listed in Table
2 above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HUMTREFAC_PEA_2_node_0 according to the present invention is
supported by 188 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1106 below describes the starting and ending position of this
segment on each transcript.

Table 1106 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        1                            359
HUMTREFAC_PEA_2_T5        1                            359



Segment cluster HUMTREFAC_PEA_2_node_9 according to the present invention is
supported by 150 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1107 below describes the starting and ending position of this
segment on each transcript.

Table 1107 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        681                          1266
HUMTREFAC_PEA_2_T5        570                          747



the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster HUMTREFAC_PEA_2_node_2 according to the present invention is
supported by 4 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4. Table 1108
below describes the starting and ending position of this segment on each transcript.



Table 1108 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        360                          470



Segment cluster HUMTREFAC_PEA_2_node_3 according to the present invention is
supported by 10 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1109 below describes the starting and ending position of this
segment on each transcript.

Table 1109 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        471                          514
HUMTREFAC_PEA_2_T5        360                          403

Segment cluster HUMTREFAC_PEA_2_node_4 according to the present invention is
supported by 197 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1110 below describes the starting and ending position of this
segment on each transcript.

Table 1110 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        515                          611
HUMTREFAC_PEA_2_T5        404                          500



Segment cluster HUMTREFAC_PEA_2_node_5 according to the present invention is
supported by 187 libraries. The number of libraries was determined as previously described.
This segment can be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1111 below describes the starting and ending position of this
segment on each transcript.

Table 1111 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        612                          661
HUMTREFAC_PEA_2_T5        501                          550

Segment cluster HUMTREFAC_PEA_2_node_8 according to the present invention can
be found in the following transcript(s): HUMTREFAC_PEA_2_T4 and
HUMTREFAC_PEA_2_T5. Table 1112 below describes the starting and ending position of this
segment on each transcript.

Table 1112 - Segment location on transcripts

Transcript name           Segment starting position    Segment ending position
HUMTREFAC_PEA_2_T4        662                          680
HUMTREFAC_PEA_2_T5        551                          569

Variant protein alignment to the previously known protein:

Sequence name: TFF3_HUMAN

Sequence documentation:

Alignment of: HUMTREFAC_PEA_2_P8 x TFF3_HUMAN

Alignment segment 1/1:

Quality: 246.00
Escore: 0
Matching length: 27
Total length: 27
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

1 MAARALCMLGLVLALLSSSSAEEYVGL
  |||||||||||||||||||||||||||
1 MAARALCMLGLVLALLSSSSAEEYVGL
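
The segment locations given in Tables 1106 to 1112 for cluster HUMTREFAC can be cross-checked with a short Python sketch that tests whether the listed segments cover each transcript from position 1 onward without gaps or overlaps. The function name `tiles_contiguously` is hypothetical; the coordinates are copied from those tables.

```python
# (start, end) positions of each segment on each transcript, from Tables 1106-1112.
SEGMENTS = {
    "HUMTREFAC_PEA_2_T4": [
        (1, 359), (360, 470), (471, 514), (515, 611),
        (612, 661), (662, 680), (681, 1266),
    ],
    "HUMTREFAC_PEA_2_T5": [
        (1, 359), (360, 403), (404, 500),
        (501, 550), (551, 569), (570, 747),
    ],
}


def tiles_contiguously(spans):
    """Return True if the spans start at 1 and each begins right after the previous one ends."""
    ordered = sorted(spans)
    if not ordered or ordered[0][0] != 1:
        return False
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(ordered, ordered[1:]))


results = {name: tiles_contiguously(spans) for name, spans in SEGMENTS.items()}
print(results)
```

The 111-position segment present only on HUMTREFAC_PEA_2_T4 (360 to 470) also accounts for the constant offset between corresponding downstream SNP positions in Tables 1103 and 1105 (for example 404 versus 515).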

DESCRIPTION FOR CLUSTER HSS100PCB
Cluster HSS100PCB features 1 transcript(s) and 3 segment(s) of interest, the names for
which are given in Tables 1113 and 1114, respectively; the sequences themselves are given at
the end of the application. The selected protein variants are given in Table 1115.



Table 1113 - Transcripts of interest

Transcript Name     Sequence ID No.
HSS100PCB_T1        133


Table 1114 - Segments of interest

Segment Name        Sequence ID No.
HSS100PCB_node_3    910
HSS100PCB_node_4    911
HSS100PCB_node_5    912



Table 1115 - Proteins of interest

Protein Name        Sequence ID No.    Corresponding Transcript(s)
HSS100PCB_P3        1401               HSS100PCB_T1



These sequences are variants of the known protein S-100P protein (SwissProt accession
identifier S10P_HUMAN), SEQ ID NO: 1457, referred to herein as the previously known
protein, which binds two calcium ions.

The sequence for protein S-100P protein is given at the end of the application, as "S-100P
protein amino acid sequence". Known polymorphisms for this sequence are as shown in Table
1116.



Table 1116 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
32                                        E -> T
44                                        F -> E

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: calcium binding; protein binding, which are annotation(s) related to
Molecular Function.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HSS100PCB can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 42 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 42 and Table 1117. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: a mixture of malignant tumors from different tissues.

Table 1117 - Normal tissue distribution

Name of Tissue    Number
bladder           41
colon             37
epithelial        38
general           22
kidney            0
liver             0
lung              18
breast            0
bone marrow       0
ovary             0
pancreas          0
prostate          46
stomach           553
uterus            13



Table 1118 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
bladder           3.3e-01    2.9e-01    2.9e-02    2.8    3.5e-02    2.8
colon             3.0e-01    1.9e-01    5.2e-01    1.2    2.4e-01    1.7
epithelial        4.7e-02    1.6e-02    2.0e-01    1.2    6.1e-02    1.3
general           1.1e-03    6.8e-05    1.4e-02    1.5    4.9e-04    1.7
kidney            6.5e-01    7.2e-01    5.8e-01    1.7    7.0e-01    1.4
liver             9.1e-01    4.9e-01    1          1.0    7.7e-02    2.1
lung              6.8e-01    7.3e-01    2.2e-02    2.9    1.3e-01    1.7
breast            2.8e-01    3.2e-01    4.7e-01    2.0    6.8e-01    1.5
bone marrow       1          6.7e-01    1          1.0    2.8e-01    2.8
ovary             2.6e-01    3.0e-01    4.7e-01    2.0    5.9e-01    1.7
pancreas          3.3e-01    4.4e-01    7.6e-02    3.7    1.5e-01    2.8
prostate          9.1e-01    9.3e-01    5.8e-01    0.6    7.6e-01    0.5
stomach           3.7e-01    3.2e-01    1          0.1    1          0.3
uterus            9.4e-01    7.0e-01    1          0.6    4.1e-01    1.1



As noted above, cluster HSS100PCB features 1 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein S-100P protein.
A description of each variant protein according to the present invention is now provided.



Variant protein HSS100PCB_P3 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) HSS100PCB_T1.
The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein HSS100PCB_P3 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1119 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein HSS100PCB_P3
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1119 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
1                                         M -> R                       Yes
11                                        M -> L                       Yes
20                                        L -> F                       Yes



Variant protein HSS100PCB_P3 is encoded by the following transcript(s):
HSS100PCB_T1, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HSS100PCB_T1 is shown in bold; this coding portion starts at
position 1057 and ends at position 1533. The transcript also has the following SNPs as listed in
Table 1120 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HSS100PCB_P3 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1120 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
52                                     C -> T                      Yes
107                                    A -> C                      Yes
458                                    C -> T                      Yes
468                                    A -> G                      Yes
648                                    C -> T                      Yes
846                                    C -> G                      Yes
882                                    G -> A                      Yes
960                                    C -> T                      No
965                                    C -> T                      Yes
1058                                   T -> G                      Yes
1087                                   A -> C                      Yes
1114                                   C -> T                      Yes
1968                                   G -> A                      Yes
1971                                   C -> T                      Yes
2010                                   C -> A                      Yes
2099                                   G ->                        No



As noted above, cluster HSS100PCB features 3 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HSS100PCB_node_3 according to the present invention is supported by
16 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSS100PCB_T1. Table 1121 below describes the
starting and ending position of this segment on each transcript.

Table 1121 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
HSS100PCB_T1        1                            1133

Segment cluster HSS100PCB_node_4 according to the present invention is supported by
29 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSS100PCB_T1. Table 1122 below describes the
starting and ending position of this segment on each transcript.



Table 1123 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
HSS100PCB_T1        1134                         1923



Segment cluster HSS100PCB_node_5 according to the present invention is supported by
141 libraries. The number of libraries was determined as previously described. This segment can
be found in the following transcript(s): HSS100PCB_T1. Table 1124 below describes the
starting and ending position of this segment on each transcript.

Table 1124 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
HSS100PCB_T1        1924                         2201






WO 2006/131783 



PCT/IB2005/004037 




DESCRIPTION FOR CLUSTER HSU33147 
Cluster HSU33147 features 2 transcript(s) and 5 segment(s) of interest, the names for
which are given in Tables 1125 and 1126, respectively; the sequences themselves are given at
the end of the application. The selected protein variants are given in Table 1127.



Table 1125 - Transcripts of interest

Transcript Name          Sequence ID No.
HSU33147_PEA_1_T1        1464
HSU33147_PEA_1_T2        1465


Table 1126 - Segments of interest

Segment Name             Sequence ID No.
HSU33147_PEA_1_node_0    1276
HSU33147_PEA_1_node_2    1277
HSU33147_PEA_1_node_4    1278
HSU33147_PEA_1_node_7    1279
HSU33147_PEA_1_node_3    1280



Table 1127 - Proteins of interest

Protein Name             Sequence ID No.    Corresponding Transcript(s)
HSU33147_PEA_1_P5        1415               HSU33147_PEA_1_T1; HSU33147_PEA_1_T2



These sequences are variants of the known protein Mammaglobin A precursor (SwissProt
accession identifier MGBA_HUMAN; known also according to the synonyms Mammaglobin 1;
Secretoglobin family 2A member 2), SEQ ID NO: 1416, referred to herein as the previously
known protein.

The sequence for protein Mammaglobin A precursor is given at the end of the application,
as "Mammaglobin A precursor amino acid sequence".

It has been investigated for clinical/therapeutic use in humans, for example as a target for
an antibody or small molecule, and/or as a direct therapeutic; available information related to
these investigations is as follows. Potential pharmaceutically related or therapeutically related
activity or activities of the previously known protein are as follows: Immunostimulant. A
therapeutic role for a protein represented by the cluster has been predicted. The cluster was
assigned this field because there was information in the drug database or the public databases
(e.g., described herein above) that this protein, or part thereof, is used or can be used for a
potential therapeutic indication: Anticancer.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: steroid binding, which are annotation(s) related to Molecular
Function.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HSU33147 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 43 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 43 and Table 1128. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: a mixture of malignant tumors from different tissues.



Table 1128 - Normal tissue distribution

Name of Tissue    Number
epithelial        6
general           2
lung              0
breast            131



Table 1129 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
epithelial        4.1e-02    6.4e-02    1.5e-12    2.6    2.2e-06    1.5
general           1.6e-02    1.1e-02    1.2e-22    4.4    7.2e-13    2.4
lung              1          6.3e-01    1          1.0    6.2e-01    1.6
breast            8.6e-02    1.1e-01    3.4e-07    1.7    2.6e-03    1.0



As noted above, cluster HSU33147 features 2 transcript(s), which were listed in Table 1
above. These transcript(s) encode for protein(s) which are variant(s) of protein Mammaglobin A
precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein HSU33147_PEA_1_P5 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
HSU33147_PEA_1_T1. An alignment is given to the known protein (Mammaglobin A
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between HSU33147_PEA_1_P5 and MGBA_HUMAN:

1. An isolated chimeric polypeptide encoding for HSU33147_PEA_1_P5, comprising a
first amino acid sequence being at least 90% homologous to
MKIXMVLMLAAL^
DELKECFLNQTDETLSNVE corresponding to amino acids 1 - 78 of MGBA_HUMAN, which
also corresponds to amino acids 1 - 78 of HSU33147_PEA_1_P5, and a second amino acid
sequence being at least 90% homologous to QLIYDSSLCDLF corresponding to amino acids 82
- 93 of MGBA_HUMAN, which also corresponds to amino acids 79 - 90 of
HSU33147_PEA_1_P5, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated chimeric polypeptide encoding for an edge portion of
HSU33147_PEA_1_P5, comprising a polypeptide having a length "n", wherein n is at least
about 10 amino acids in length, optionally at least about 20 amino acids in length, preferably at
least about 30 amino acids in length, more preferably at least about 40 amino acids in length and
most preferably at least about 50 amino acids in length, wherein at least two amino acids
comprise EQ, having a structure as follows: a sequence starting from any of amino acid numbers
78-x to 78; and ending at any of amino acid numbers 79+ ((n-2) - x), in which x varies from 0 to
n-2.
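
The edge-portion definition above describes every window of length n that spans the junction between amino acids 78 and 79 (the residues E and Q). As a hedged illustration, the following Python sketch enumerates those windows from the stated formula. The function name `edge_windows` and the optional `protein_length` filter are assumptions added here; the value 90 used in the example is taken from the comparison report, which places the last residue of the variant at position 90.

```python
def edge_windows(n, junction=78, protein_length=None):
    """Yield 1-based inclusive (start, end) windows of length n spanning the junction.

    Follows the stated formula: start = junction - x and
    end = (junction + 1) + ((n - 2) - x), for x = 0 .. n-2.
    If protein_length is given, windows running past either end are skipped
    (an added convenience, not part of the quoted definition).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to include both junction residues")
    for x in range(n - 1):  # x runs from 0 to n-2 inclusive
        start = junction - x
        end = junction + 1 + (n - 2) - x
        if protein_length is not None and (start < 1 or end > protein_length):
            continue
        yield start, end


windows = list(edge_windows(10))
print(windows[0], windows[-1], len(windows))  # (78, 87) (70, 79) 9

# With the 90-residue variant length, long windows that overrun the end are dropped.
bounded = list(edge_windows(20, protein_length=90))
print(bounded[0], bounded[-1])  # (71, 90) (60, 79)
```

Every window produced this way has exactly n residues, since end - start + 1 = n for any x.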

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

The glycosylation sites of variant protein HSU33147_PEA_1_P5, as compared to the
known protein Mammaglobin A precursor, are described in Table 1130 (given according to their
position(s) on the amino acid sequence in the first column; the second column indicates whether
the glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).



Table 1130 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
68                                         yes                           68
53                                         yes                           53

Variant protein HSU33147_PEA_1_P5 is encoded by the following transcript(s):
HSU33147_PEA_1_T1, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript HSU33147_PEA_1_T1 is shown in bold; this coding portion starts
at position 72 and ends at position 341. The transcript also has the following SNPs as listed in
Table 1131 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein HSU33147_PEA_1_P5 sequence provides support for the
deduced sequence of this variant protein according to the present invention).

Table 1131 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
84                                    A -> C                     No
124                                   C ->                       No
396                                   A -> G                     No


As noted above, cluster HSU33147 features 5 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HSU33147_PEA_1_node_0 according to the present invention is
supported by 38 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HSU33147_PEA_1_T1 and
HSU33147_PEA_1_T2. Table 1132 below describes the starting and ending position of this
segment on each transcript.

Table 1132 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSU33147_PEA_1_T1    1                           126
HSU33147_PEA_1_T2    1                           126









Segment cluster HSU33147_PEA_1_node_2 according to the present invention is
supported by 44 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HSU33147_PEA_1_T1 and
HSU33147_PEA_1_T2. Table 1133 below describes the starting and ending position of this
segment on each transcript.

Table 1133 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSU33147_PEA_1_T1    127                         305
HSU33147_PEA_1_T2    127                         305

Segment cluster HSU33147_PEA_1_node_4 according to the present invention is
supported by 3 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HSU33147_PEA_1_T2. Table 1134 below
describes the starting and ending position of this segment on each transcript.

Table 1134 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSU33147_PEA_1_T2    315                         907



Segment cluster HSU33147_PEA_1_node_7 according to the present invention is
supported by 35 libraries. The number of libraries was determined as previously described. This
segment can be found in the following transcript(s): HSU33147_PEA_1_T1. Table 1135 below
describes the starting and ending position of this segment on each transcript.

Table 1135 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSU33147_PEA_1_T1    306                         516



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster HSU33147_PEA_1_node_3 according to the present invention can be
found in the following transcript(s): HSU33147_PEA_1_T2. Table 1136 below describes the
starting and ending position of this segment on each transcript.

Table 1136 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
HSU33147_PEA_1_T2    306                         314



Variant protein alignment to the previously known protein:

Sequence name: MGBA_HUMAN

Sequence documentation:

Alignment of: HSU33147_PEA_1_P5 x MGBA_HUMAN

Alignment segment 1/1:

Quality: 776.00
Escore: 0
Matching length: 90
Total length: 93
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 96.77
Total Percent Identity: 96.77
Gaps: 1

Alignment:

 1 MKLLMVLMLAALSQHCYAGSGCPLLENVISKTINPQVSKTEYKELLQEFI 50
   ||||||||||||||||||||||||||||||||||||||||||||||||||
 1 MKLLMVLMLAALSQHCYAGSGCPLLENVISKTINPQVSKTEYKELLQEFI 50

51 DDNATTNAIDELKECFLNQTDETLSNVE...QLIYDSSLCDLF 90
   ||||||||||||||||||||||||||||   ||||||||||||
51 DDNATTNAIDELKECFLNQTDETLSNVEVFMQLIYDSSLCDLF 93

DESCRIPTION FOR CLUSTER R20779 
Cluster R20779 features 1 transcript(s) and 24 segment(s) of interest, the names for which
are given in Tables 1137 and 1138, respectively; the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 1139.

Table 1137 - Transcripts of interest

Transcript Name   Sequence ID No.
R20779_T7         134

Table 1138 - Segments of interest

Segment Name      Sequence ID No.
R20779_node_0     913
R20779_node_2     914
R20779_node_7     915
R20779_node_9     916
R20779_node_18    917
R20779_node_21    918
R20779_node_24    919
R20779_node_27    920
R20779_node_28    921
R20779_node_30    922
R20779_node_31    923
R20779_node_32    924
R20779_node_1     925
R20779_node_3     926
R20779_node_10    927
R20779_node_11    928
R20779_node_14    929
R20779_node_17    930
R20779_node_19    931
R20779_node_20    932
R20779_node_22    933
R20779_node_23    934
R20779_node_25    935
R20779_node_29    936


Table 1139 - Proteins of interest

Protein Name   Sequence ID No.   Corresponding Transcript(s)
R20779_P2      1402              R20779_T7

WO 2006/131783   PCT/IB2005/004037









These sequences are variants of the known protein Stanniocalcin 2 precursor (SwissProt
accession identifier STC2_HUMAN; known also according to the synonyms STC-2;
Stanniocalcin-related protein; STCRP; STC-related protein), SEQ ID NO: 1458, referred to
herein as the previously known protein.

Protein Stanniocalcin 2 precursor is known or believed to have the following function(s):
Has an anti-hypocalcemic action on calcium and phosphate homeostasis. The sequence for
protein Stanniocalcin 2 precursor is given at the end of the application, as "Stanniocalcin 2
precursor amino acid sequence". Protein Stanniocalcin 2 precursor localization is believed to be
Secreted (Potential).

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: cell surface receptor linked signal transduction; cell-cell signaling;
nutritional response pathway, which are annotation(s) related to Biological Process; hormone,
which are annotation(s) related to Molecular Function; and extracellular, which are
annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.



Cluster R20779 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 44 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
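As a rough illustration of the "parts per million" figure, the following Python sketch scales the ratio of cluster ESTs to all ESTs in a category by one million. The function name `parts_per_million` and the example counts are hypothetical, and the weighting applied to EST expression in the described method is not specified in this passage, so none is applied here.

```python
def parts_per_million(cluster_est_count, total_est_count):
    """Return cluster ESTs per million ESTs in one tissue category.

    Assumption: inputs are already-weighted EST counts; the weighting
    scheme itself is outside the scope of this sketch.
    """
    if total_est_count <= 0:
        raise ValueError("total_est_count must be positive")
    return 1_000_000 * cluster_est_count / total_est_count


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the scaling.
    print(parts_per_million(11, 1_000_000))
    print(parts_per_million(5, 200_000))
```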

Overall, the following results were obtained as shown with regard to the histograms in
Figure 44 and Table 1140. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors
from different tissues and lung malignant tumors.

Table 1140 - Normal tissue distribution

Name of Tissue   Number
bone             825
brain            0
colon            0
epithelial       32
general          38
kidney           22
liver            9
lung             11
lymph nodes      0
breast           215
muscle           35
ovary            36
pancreas         4
prostate         80
skin             99
stomach          0
uterus           4



Table 1141 - P values and ratios for expression in cancerous tissue

Name of Tissue   P1        P2        SP1       R3    SP2       R4
bone             5.9e-01   7.4e-01   1         0.2   1         0.1
brain            2.5e-02   1.6e-02   2.2e-01   6.0   3.5e-02   8.0
colon            1.7e-01   1.7e-01   1         1.3   7.7e-01   1.5
epithelial       1.7e-01   1.5e-03   5.9e-01   1.0   2.0e-04   2.0
general          2.4e-02   6.2e-07   7.6e-01   0.8   4.6e-05   1.6
kidney           4.3e-01   2.7e-01   6.2e-01   1.3   1.5e-01   2.0
liver            8.3e-01   7.6e-01   1         0.8   3.3e-01   1.6
lung             1.2e-01   1.4e-03   1.9e-01   2.9   1.6e-05   7.7
lymph nodes      1         3.1e-01   1         1.0   1         1.4
breast           6.8e-01   6.8e-01   6.9e-01   0.8   3.6e-01   0.8
muscle           9.2e-01   4.8e-01   1         0.3   1.4e-03   1.4
ovary            8.4e-01   7.1e-01   9.0e-01   0.7   8.6e-01   0.8
pancreas         9.3e-01   6.8e-01   1         0.7   1.5e-01   2.0
prostate         9.1e-01   5.0e-01   9.8e-01   0.4   5.7e-01   0.7
skin             6.3e-01   7.5e-01   7.1e-01   0.8   9.5e-01   0.3
stomach          1         4.5e-01   1         1.0   5.1e-01   1.8
uterus           7.1e-01   2.6e-01   4.4e-01   1.7   4.1e-01   1.8



above. These transcript(s) encode for protein(s) which are variant(s) of protein Stanniocalcin 2 
precursor. A description of each variant protein according to the present invention is now 
provided. 



Variant protein R20779_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s) R20779_T7. An
alignment is given to the known protein (Stanniocalcin 2 precursor) at the end of the application.
One or more alignments to one or more previously published protein sequences are given at the
end of the application. A brief description of the relationship of the variant protein according to
the present invention to each such aligned protein is as follows:

Comparison report between R20779_P2 and STC2_HUMAN:

1. An isolated chimeric polypeptide encoding for R20779_P2, comprising a first amino
acid sequence being at least 90 % homologous to
MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNTAEIQHCLV
NAGDVGCGVFECFENNSCEIRGLHGICMTFLHNAGKFDAQGKSFIKDALKCKAHALRH
RFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQENTRVIVEMIHFKDLLLHE
corresponding to amino acids 1 - 169 of STC2_HUMAN, which also corresponds to amino
acids 1 - 169 of R20779_P2, and a second amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence CYKIEITMPKRRKVKLRD
corresponding to amino acids 170 - 187 of R20779_P2, wherein said first amino acid sequence
and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R20779_P2, comprising a polypeptide
being at least 70%, optionally at least about 80%, preferably at least about 85%, more preferably
at least about 90% and most preferably at least about 95% homologous to the sequence
CYKIEITMPKRRKVKLRD in R20779_P2.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R20779_P2 also has the following non-silent SNPs (Single Nucleotide
Polymorphisms) as listed in Table 1142 (given according to their position(s) on the amino acid
sequence, with the alternative amino acid(s) listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R20779_P2 sequence provides
support for the deduced sequence of this variant protein according to the present invention).

Table 1142 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
16                                       L ->                        No
98                                       Q ->                        No
171                                      Y -> C                      Yes
177                                      M -> V                      Yes



The glycosylation sites of variant protein R20779_P2, as compared to the known protein
Stanniocalcin 2 precursor, are described in Table 1143 (given according to their position(s) on
the amino acid sequence in the first column; the second column indicates whether the
glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).

Table 1143 - Glycosylation site(s)

Position(s) on known amino acid sequence   Present in variant protein?   Position in variant protein?
73                                         yes                           73



Variant protein R20779_P2 is encoded by the following transcript(s): R20779_T7, for
which the sequence(s) is/are given at the end of the application. The coding portion of transcript
R20779_T7 is shown in bold; this coding portion starts at position 1397 and ends at position
1957. The transcript also has the following SNPs as listed in Table 1144 (given according to
their position on the nucleotide sequence, with the alternative nucleic acid listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
R20779_P2 sequence provides support for the deduced sequence of this variant protein
according to the present invention).



Table 1144 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
1442                                  T ->                       No
1690                                  G ->                       No
1732                                  C -> T                     Yes
1867                                  G -> T                     Yes
1908                                  A -> G                     Yes
1925                                  A -> G                     Yes
1968                                  G -> A                     Yes
2087                                  C -> T                     No
2138                                  C -> T                     Yes
2270                                  C ->                       No
2443                                  A ->                       No
2478                                  G ->                       No
2479                                  C -> A                     No
2616                                  C -> A                     No
2941                                  C ->                       No
3196                                  -> A                       No
3479                                  T -> G                     Yes
4290                                  C -> T                     Yes
4358                                  G -> A                     Yes
5363                                  G -> A                     No
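The nucleotide SNP positions on transcript R20779_T7 can be related to the amino acid positions of R20779_P2 through the stated coding portion (positions 1397 to 1957). The Python sketch below is illustrative only: `residue_for` is a hypothetical helper, and it assumes the coding portion is read in frame from position 1397 with 1-based coordinates. Under that assumption the 561 coding nucleotides correspond to 187 codons, and nucleotide positions 1442, 1690, 1908 and 1925 fall on residues 16, 98, 171 and 177, the positions listed in the amino acid mutation table.

```python
CDS_START = 1397  # first nucleotide of the coding portion (1-based)
CDS_END = 1957    # last nucleotide of the coding portion (1-based)


def residue_for(nt_pos, cds_start=CDS_START, cds_end=CDS_END):
    """Map a 1-based transcript position to a 1-based residue number.

    Returns None for positions outside the coding portion (UTR).
    """
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


if __name__ == "__main__":
    codons = (CDS_END - CDS_START + 1) // 3
    print("codons in coding portion:", codons)
    for pos in (1442, 1690, 1732, 1867, 1908, 1925, 1968):
        print(pos, residue_for(pos))
```

Positions after 1957, such as 1968, lie outside the coding portion and therefore have no residue number.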


As noted above, cluster R20779 features 24 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R20779_node_0 according to the present invention is supported by 31
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1145 below describes the starting and
ending position of this segment on each transcript.

Table 1145 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1                           1298



Segment cluster R20779_node_2 according to the present invention is supported by 55
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1146 below describes the starting and
ending position of this segment on each transcript.

Table 1146 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1337                        1506



Segment cluster R20779_node_7 according to the present invention is supported by 63
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1147 below describes the starting and
ending position of this segment on each transcript.

Table 1147 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1548                        1690



Segment cluster R20779_node_9 according to the present invention is supported by 66
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1148 below describes the starting and
ending position of this segment on each transcript.

Table 1148 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1691                        1838

Segment cluster R20779_node_18 according to the present invention is supported by 61
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1149 below describes the starting and
ending position of this segment on each transcript.

Table 1149 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2009                        2176



Segment cluster R20779_node_21 according to the present invention is supported by 106
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1150 below describes the starting and
ending position of this segment on each transcript.

Table 1150 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2219                        2796



Segment cluster R20779_node_24 according to the present invention is supported by 100
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1151 below describes the starting and
ending position of this segment on each transcript.

Table 1151 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2977                        3667



Segment cluster R20779_node_27 according to the present invention is supported by 26
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1152 below describes the starting and
ending position of this segment on each transcript.

Table 1152 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         3673                        3803

Segment cluster R20779_node_28 according to the present invention is supported by 31
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1153 below describes the starting and
ending position of this segment on each transcript.

Table 1153 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         3804                        4050



Segment cluster R20779_node_30 according to the present invention is supported by 34
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1154 below describes the starting and
ending position of this segment on each transcript.

Table 1154 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         4068                        4193

Segment cluster R20779_node_31 according to the present invention is supported by 46
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1155 below describes the starting and
ending position of this segment on each transcript.

Table 1155 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         4194                        4424



Segment cluster R20779_node_32 according to the present invention is supported by 88
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1156 below describes the starting and
ending position of this segment on each transcript.

Table 1156 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         4425                        5503



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster R20779_node_1 according to the present invention is supported by 27
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1157 below describes the starting and
ending position of this segment on each transcript.

Table 1157 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1299                        1336









Segment cluster R20779_node_3 according to the present invention is supported by 52
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1158 below describes the starting and
ending position of this segment on each transcript.

Table 1158 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1507                        1547



Segment cluster R20779_node_10 according to the present invention can be found in the
following transcript(s): R20779_T7. Table 1159 below describes the starting and ending
position of this segment on each transcript.

Table 1159 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1839                        1849



Segment cluster R20779_node_11 according to the present invention is supported by 58
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1160 below describes the starting and
ending position of this segment on each transcript.

Table 1160 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1850                        1902









Segment cluster R20779_node_14 according to the present invention is supported by 1
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1161 below describes the starting and
ending position of this segment on each transcript.

Table 1161 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1903                        1975



Segment cluster R20779_node_17 according to the present invention is supported by 54
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1162 below describes the starting and
ending position of this segment on each transcript.

Table 1162 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         1976                        2008

Segment cluster R20779_node_19 according to the present invention can be found in the
following transcript(s): R20779_T7. Table 1163 below describes the starting and ending
position of this segment on each transcript.

Table 1163 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2177                        2188









Segment cluster R20779_node_20 according to the present invention is supported by 53
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1164 below describes the starting and
ending position of this segment on each transcript.

Table 1164 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2189                        2218



Segment cluster R20779_node_22 according to the present invention is supported by 76
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1165 below describes the starting and
ending position of this segment on each transcript.

Table 1165 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2797                        2899

Segment cluster R20779_node_23 according to the present invention is supported by 81
libraries. The number of libraries was determined as previously described. This segment can be
found in the following transcript(s): R20779_T7. Table 1166 below describes the starting and
ending position of this segment on each transcript.

Table 1166 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         2900                        2976



Segment cluster R20779_node_25 according to the present invention can be found in the
following transcript(s): R20779_T7. Table 1167 below describes the starting and ending
position of this segment on each transcript.

Table 1167 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         3668                        3672



Segment cluster R20779_node_29 according to the present invention can be found in the
following transcript(s): R20779_T7. Table 1168 below describes the starting and ending
position of this segment on each transcript.

Table 1168 - Segment location on transcripts

Transcript name   Segment starting position   Segment ending position
R20779_T7         4051                        4067
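The segment locations given for transcript R20779_T7 in Tables 1145 to 1168 can be collected into one list and checked for consistency. The Python sketch below is only an illustration: the names `SEGMENTS`, `is_contiguous` and `segment_at` are hypothetical, and the coordinates are copied from those tables. It verifies that the 24 segments tile the transcript from position 1 to 5503 without gaps or overlaps, and looks up which segment contains a given nucleotide position.

```python
from bisect import bisect_right

# (segment name, start, end) on R20779_T7, 1-based inclusive, in transcript order.
SEGMENTS = [
    ("R20779_node_0", 1, 1298), ("R20779_node_1", 1299, 1336),
    ("R20779_node_2", 1337, 1506), ("R20779_node_3", 1507, 1547),
    ("R20779_node_7", 1548, 1690), ("R20779_node_9", 1691, 1838),
    ("R20779_node_10", 1839, 1849), ("R20779_node_11", 1850, 1902),
    ("R20779_node_14", 1903, 1975), ("R20779_node_17", 1976, 2008),
    ("R20779_node_18", 2009, 2176), ("R20779_node_19", 2177, 2188),
    ("R20779_node_20", 2189, 2218), ("R20779_node_21", 2219, 2796),
    ("R20779_node_22", 2797, 2899), ("R20779_node_23", 2900, 2976),
    ("R20779_node_24", 2977, 3667), ("R20779_node_25", 3668, 3672),
    ("R20779_node_27", 3673, 3803), ("R20779_node_28", 3804, 4050),
    ("R20779_node_29", 4051, 4067), ("R20779_node_30", 4068, 4193),
    ("R20779_node_31", 4194, 4424), ("R20779_node_32", 4425, 5503),
]
STARTS = [start for _, start, _ in SEGMENTS]


def is_contiguous(segments):
    """True if each segment starts one position after the previous one ends."""
    return all(prev[2] + 1 == cur[1] for prev, cur in zip(segments, segments[1:]))


def segment_at(position):
    """Return the name of the segment containing a 1-based position, or None."""
    index = bisect_right(STARTS, position) - 1
    if index < 0:
        return None
    name, start, end = SEGMENTS[index]
    return name if start <= position <= end else None


if __name__ == "__main__":
    print("contiguous:", is_contiguous(SEGMENTS))
    for pos in (1442, 1908, 5363):
        print(pos, segment_at(pos))
```

A binary search over the start positions is used for the lookup because the list is sorted; a linear scan would work equally well for 24 segments.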



Variant protein alignment to the previously known protein:

Sequence name: STC2_HUMAN

Sequence documentation:

Alignment of: R20779_P2 x STC2_HUMAN

Alignment segment 1/1:

Quality: 1688.00
Escore: 0
Matching length: 171
Total length: 171
Matching Percent Similarity: 99.42
Matching Percent Identity: 99.42
Total Percent Similarity: 99.42
Total Percent Identity: 99.42
Gaps: 0

Alignment:

  1 MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNT 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNT 50

 51 AEIQHCLVNAGDVGCGVFECFENNSCEIRGLHGICMTFLHNAGKFDAQGK 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 AEIQHCLVNAGDVGCGVFECFENNSCEIRGLHGICMTFLHNAGKFDAQGK 100

101 SFIKDALKCKAHALRHRFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQ 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 SFIKDALKCKAHALRHRFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQ 150

151 ENTRVIVEMIHFKDLLLHECY 171
    ||||||||||||||||||| |
151 ENTRVIVEMIHFKDLLLHEPY 171



DESCRIPTION FOR CLUSTER R38144 
Cluster R38144 features 6 transcript(s) and 24 segment(s) of interest, the names for which
are given in Tables 1169 and 1170, respectively; the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 1171.



Table 1169 - Transcripts of interest

Transcript Name     Sequence ID No.
R38144_PEA_2_T6     135
R38144_PEA_2_T10    136
R38144_PEA_2_T13    137
R38144_PEA_2_T15    138
R38144_PEA_2_T19    139
R38144_PEA_2_T27    140


Table 1170 - Segments of interest

Segment Name            Sequence ID No.
R38144_PEA_2_node_21    937
R38144_PEA_2_node_26    938
R38144_PEA_2_node_29    939
R38144_PEA_2_node_31    940
R38144_PEA_2_node_46    941
R38144_PEA_2_node_47    942
R38144_PEA_2_node_49    943
R38144_PEA_2_node_0     944
R38144_PEA_2_node_1     945
R38144_PEA_2_node_4     946
R38144_PEA_2_node_5     947
R38144_PEA_2_node_7     948
R38144_PEA_2_node_11    949
R38144_PEA_2_node_14    950
R38144_PEA_2_node_15    951
R38144_PEA_2_node_16    952
R38144_PEA_2_node_19    953
R38144_PEA_2_node_20    954
R38144_PEA_2_node_36    955
R38144_PEA_2_node_37    956
R38144_PEA_2_node_43    957
R38144_PEA_2_node_44    958
R38144_PEA_2_node_45    959
R38144_PEA_2_node_51    960



Table 1171 - Proteins of interest

Protein Name          Sequence ID No.    Corresponding Transcript(s)
R38144_PEA_2_P6       1403               R38144_PEA_2_T6
R38144_PEA_2_P13      1404               R38144_PEA_2_T13
R38144_PEA_2_P15      1405               R38144_PEA_2_T15
R38144_PEA_2_P19      1406               R38144_PEA_2_T19
R38144_PEA_2_P24      1407               R38144_PEA_2_T27
R38144_PEA_2_P36      1408               R38144_PEA_2_T10



These sequences are variants of the known protein Putative alpha-mannosidase C20orf31
precursor (SwissProt accession identifier CT31_HUMAN; known also according to the
synonyms EC 3.2.1), SEQ ID NO: 1459, referred to herein as the previously known protein.

The sequence for protein Putative alpha-mannosidase C20orf31 precursor is given at the
end of the application, as "Putative alpha-mannosidase C20orf31 precursor amino acid
sequence". Known polymorphisms for this sequence are as shown in Table 1172.

Table 1172 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
456                                       A -> T. /FTId=VAR_012165.
511                                       S -> C



Protein Putative alpha-mannosidase C20orf31 precursor localization is believed to be 
Secreted (Potential). 

The following GO Annotation(s) apply to the previously known protein. The following 
annotation(s) were found: carbohydrate metabolism; N-linked glycosylation, which are 
annotation(s) related to Biological Process; mannosyl-oligosaccharide 1,2-alpha-mannosidase;
calcium binding; hydrolase, acting on glycosyl bonds, which are annotation(s) related to
Molecular Function; and membrane, which are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster R38144 can be used as a diagnostic marker according to overexpression of 
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given 
according to the previously described methods. The term "number" in the right hand column of 
the table and the numbers on the y-axis of Figure 45 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to 
the expression of all ESTs in that category, according to parts per million). 
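
The "parts per million" figure described above is a simple ratio, and can be sketched as follows. This is an illustrative Python sketch only: the function name and the EST counts in the example are hypothetical, and the weighting that the application applies to EST counts is not described in this passage, so the inputs are assumed to be already weighted.

```python
def expression_ppm(cluster_ests: float, category_ests: float) -> float:
    """Expression of a cluster within a tissue category, in parts per million.

    cluster_ests:  (weighted) EST count for the cluster in the category
    category_ests: (weighted) EST count for all clusters in the category
    """
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical counts, chosen only to illustrate the arithmetic:
# 3 ESTs for the cluster out of 250,000 ESTs in the category.
example = expression_ppm(3, 250_000)
print(example)  # 12.0
```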

Overall, the following results were obtained as shown with regard to the histograms in 
Figure 45 and Table 1173. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, lung malignant tumors, skin
malignancies and gastric carcinoma. 

Table 1173 - Normal tissue distribution

Name of Tissue    Number
Adrenal           40
Bladder           41
Bone              38
Brain             16
Colon             37
Epithelial        18
General           31
head and neck     50
Kidney            26
Liver             4
Lung              11
lymph nodes       47
Breast            52
Ovary             7
Pancreas          20
Prostate          0
Skin              13
Stomach           0
Uterus            0



Table 1174 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
Adrenal           9.2e-01    6.9e-01    1          0.5    7.8e-01    0.9
Bladder           7.6e-01    8.1e-01    8.1e-01    0.9    9.0e-01    0.7
Bone              6.6e-01    8.5e-01    1          0.6    1          0.6
Brain             8.0e-02    6.0e-02    4.7e-02    3.0    1.6e-02    3.0
colon             7.7e-01    7.5e-01    1          0.5    3.5e-01    0.8
epithelial        2.0e-01    4.8e-03    1.7e-01    1.4    2.7e-16    5.2
general           3.9e-01    2.2e-02    7.8e-01    0.9    2.1e-19    2.9
head and neck     3.4e-01    5.6e-01    4.6e-01    1.4    7.5e-01    0.9
kidney            8.3e-01    7.7e-01    4.4e-01    1.4    8.5e-02    1.6
liver             9.1e-01    6.0e-01    1          0.9    1.1e-01    1.8
lung              1.6e-02    1.5e-02    9.5e-02    3.8    1.6e-05    6.6
lymph nodes       7.1e-01    7.8e-01    1          0.3    1.2e-04    1.0
breast            9.1e-01    9.1e-01    1          0.5    9.7e-01    0.6
ovary             5.0e-01    2.9e-01    4.7e-01    1.7    7.0e-02    2.2
pancreas          7.2e-01    4.2e-01    8.1e-01    0.8    3.0e-02    1.8
prostate          7.9e-01    5.7e-01    3.0e-01    2.5    1.8e-04    3.0
skin              9.2e-01    8.7e-02    1          0.5    3.0e-05    4.1
stomach           3.0e-01    5.5e-02    2.5e-01    3.0    9.2e-04    6.1
uterus            2.1e-01    9.4e-02    4.4e-01    2.0    5.1e-01    1.9



above. These transcript(s) encode for protein(s) which are variant(s) of protein Putative
alpha-mannosidase C20orf31 precursor. A description of each variant protein according to the present
invention is now provided. 



Variant protein R38144_PEA_2_P6 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T6. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R38144_PEA_2_P6 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P6, comprising a first
amino acid sequence being at least 90% homologous to
MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGG
LPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESIEKISKVEC
GFAT corresponding to amino acids 1 - 412 of CT31_HUMAN, which also corresponds to
amino acids 1 - 412 of R38144_PEA_2_P6, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
LASFSHMSDQRSARPQAGQPHGVVLPGRDCEIPLPPV corresponding to amino acids 413 -
449 of R38144_PEA_2_P6, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence LASFSHMSDQRSARPQAGQPHGVVLPGRDCEIPLPPV in R38144_PEA_2_P6.

The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
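
The decision rule stated in the paragraph above (secreted when every signal-peptide predictor reports a signal peptide and no trans-membrane predictor reports a trans-membrane region) can be written as a small function. This is a sketch under stated assumptions: the function name, the boolean inputs and the "undetermined" fallback label are illustrative and are not taken from the application, which does not say what is reported when the programs disagree.

```python
from typing import Sequence


def predicted_localization(signal_peptide_calls: Sequence[bool],
                           transmembrane_calls: Sequence[bool]) -> str:
    """Combine predictor outputs into a localization call.

    signal_peptide_calls: one boolean per signal-peptide prediction program
    transmembrane_calls:  one boolean per trans-membrane prediction program
    """
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one call of each kind is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    # Assumption: any other combination is left open in this sketch.
    return "undetermined"


# Two signal-peptide programs agree, two trans-membrane programs find nothing.
print(predicted_localization([True, True], [False, False]))  # secreted
```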

Variant protein R38144_PEA_2_P6 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1175 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P6
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1175 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
54                                        A -> V                       Yes
55                                        F -> L                       Yes
73                                        S -> I                       Yes
87                                        I ->                         No
145                                       P ->                         No
145                                       P -> A                       No
164                                       A -> G                       No
164                                       A ->                         No
203                                       A -> G                       No
203                                       A ->                         No
211                                       D ->                         No
236                                       G ->                         No
265                                       V -> G                       No
285                                       K ->                         No
294                                       D -> N                       No
305                                       G -> E                       No
323                                       Q -> R                       No
346                                       F ->                         No



The glycosylation sites of variant protein R38144_PEA_2_P6, as compared to the known
protein Putative alpha-mannosidase C20orf31 precursor, are described in Table 1176 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 1176 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
450                                         no
289                                         yes                            289
112                                         yes                            112
90                                          yes                            90



Variant protein R38144_PEA_2_P6 is encoded by the following transcript(s):
R38144_PEA_2_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T6 is shown in bold; this coding portion starts at
position 91 and ends at position 1437. The transcript also has the following SNPs as listed in
Table 1177 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P6 sequence provides support for the deduced
sequence of this variant protein according to the present invention).
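
The stated coding portion (positions 91 to 1437 of transcript R38144_PEA_2_T6) relates the nucleotide SNP positions to the amino acid positions given for the variant protein. The following Python sketch shows that arithmetic; it assumes 1-based positions, that position 91 is the first base of codon 1, and that position 1437 is the last coding base. The function name `codon_number` is illustrative.

```python
from typing import Optional

CDS_START = 91    # first nucleotide of the coding portion (1-based)
CDS_END = 1437    # last nucleotide of the coding portion (1-based)


def codon_number(nt_pos: int,
                 cds_start: int = CDS_START,
                 cds_end: int = CDS_END) -> Optional[int]:
    """Return the 1-based codon (amino acid) number for a nucleotide position,
    or None when the position lies outside the coding portion."""
    if not cds_start <= nt_pos <= cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Number of codons spanned by the coding portion.
protein_length = (CDS_END - CDS_START + 1) // 3

print(protein_length)      # 449
print(codon_number(251))   # 54
print(codon_number(1481))  # None
```

Under these assumptions the coding portion spans 449 codons, matching the 449 amino acids described for R38144_PEA_2_P6, and nucleotide positions 120, 251, 253 and 308 fall in codons 10, 54, 55 and 73, which are positions listed among the amino acid changes for this variant. Positions beyond 1437, such as 1481, lie outside the coding portion.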

Table 1177 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
251                                    C -> T                      Yes
253                                    T -> C                      Yes
308                                    G -> T                      Yes
312                                    T -> C                      No
350                                    T ->                        No
523                                    C ->                        No
523                                    C -> G                      No
581                                    C ->                        No
581                                    C -> G                      No
698                                    C ->                        No
698                                    C -> G                      No
723                                    C ->                        No
798                                    C ->                        No
798                                    C -> G                      No
849                                    -> C                        No
849                                    -> G                        No
884                                    T -> G                      No
901                                    -> C                        No
901                                    -> T                        No
943                                    A ->                        No
970                                    G -> A                      No
1004                                   G -> A                      No
1058                                   A -> G                      No
1126                                   T ->                        No
1218                                   C -> T                      Yes
1392                                   A -> G                      No
1425                                   T -> C                      No
1481                                   G -> A                      Yes
1560                                   C -> T                      No
1566                                   C ->                        No
1644                                   G -> A                      Yes
1646                                   A -> T                      No
1763                                   A ->                        No
1763                                   A -> C                      No
1781                                   C -> T                      Yes
1799                                   C ->                        No
1799                                   C -> G                      No
1844                                   T -> G                      No
1855                                   A -> C                      Yes

Variant protein R38144_PEA_2_P13 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T13. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R38144_PEA_2_P13 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P13, comprising a first
amino acid sequence being at least 90% homologous to
MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQ corresponding to amino acids 1 - 323 of
CT31_HUMAN, which also corresponds to amino acids 1 - 323 of R38144_PEA_2_P13, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence NLLKAQCTSTVPRGIPPS corresponding to amino acids 324 - 341 of
R38144_PEA_2_P13, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P13, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence NLLKAQCTSTVPRGIPPS in R38144_PEA_2_P13.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R38144_PEA_2_P13 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1178 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P13
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1178 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
54                                        A -> V                       Yes
55                                        F -> L                       Yes
73                                        S -> I                       Yes
87                                        I ->                         No
145                                       P ->                         No
145                                       P -> A                       No
164                                       A -> G                       No
164                                       A ->                         No
203                                       A -> G                       No
203                                       A ->                         No
211                                       D ->                         No
236                                       G ->                         No
265                                       V -> G                       No
285                                       K ->                         No
294                                       D -> N                       No
305                                       G -> E                       No
323                                       Q -> R                       No
328                                       A -> V                       Yes



The glycosylation sites of variant protein R38144_PEA_2_P13, as compared to the
known protein Putative alpha-mannosidase C20orf31 precursor, are described in Table 1179
(given according to their position(s) on the amino acid sequence in the first column; the second
column indicates whether the glycosylation site is present in the variant protein; and the last
column indicates whether the position is different on the variant protein).

Table 1179 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
450                                         no
289                                         yes                            289
112                                         yes                            112
90                                          yes                            90



Variant protein R38144_PEA_2_P13 is encoded by the following transcript(s):
R38144_PEA_2_T13, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T13 is shown in bold; this coding portion starts at
position 91 and ends at position 1113. The transcript also has the following SNPs as listed in
Table 1180 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P13 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1180 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
251                                    C -> T                      Yes
253                                    T -> C                      Yes
308                                    G -> T                      Yes
312                                    T -> C                      No
350                                    T ->                        No
523                                    C ->                        No
523                                    C -> G                      No
581                                    C ->                        No
581                                    C -> G                      No
698                                    C ->                        No
698                                    C -> G                      No
723                                    C ->                        No
798                                    C ->                        No
798                                    C -> G                      No
849                                    -> C                        No
849                                    -> G                        No
884                                    T -> G                      No
901                                    -> C                        No
901                                    -> T                        No
943                                    A ->                        No
970                                    G -> A                      No
1004                                   G -> A                      No
1058                                   A -> G                      No
1073                                   C -> T                      Yes
1222                                   A -> G                      No
1255                                   T -> C                      No
1311                                   G -> A                      Yes
1390                                   C -> T                      No
1396                                   C ->                        No
1474                                   G -> A                      Yes
1476                                   A -> T                      No
1593                                   A ->                        No
1593                                   A -> C                      No
1611                                   C -> T                      Yes
1629                                   C ->                        No
1629                                   C -> G                      No
1674                                   T -> G                      No
1685                                   A -> C                      Yes



Variant protein R38144_PEA_2_P15 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T15. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R38144_PEA_2_P15 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P15, comprising a first
amino acid sequence being at least 90% homologous to
MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLE corresponding
to amino acids 1 - 282 of CT31_HUMAN, which also corresponds to amino acids 1 - 282 of
R38144_PEA_2_P15, and a second amino acid sequence being at least 70%, optionally at least
80%, preferably at least 85%, more preferably at least 90% and most preferably at least 95%
homologous to a polypeptide having the sequence PHWRH corresponding to amino acids 283 -
287 of R38144_PEA_2_P15, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P15, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence PHWRH in R38144_PEA_2_P15.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R38144_PEA_2_P15 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1181 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P15
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1181 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
54                                        A -> V                       Yes
55                                        F -> L                       Yes
73                                        S -> I                       Yes
87                                        I ->                         No
145                                       P ->                         No
145                                       P -> A                       No
164                                       A -> G                       No
164                                       A ->                         No
203                                       A -> G                       No
203                                       A ->                         No
211                                       D ->                         No
236                                       G ->                         No
265                                       V -> G                       No



The glycosylation sites of variant protein R38144_PEA_2_P15, as compared to the
known protein Putative alpha-mannosidase C20orf31 precursor, are described in Table 1182
(given according to their position(s) on the amino acid sequence in the first column; the second
column indicates whether the glycosylation site is present in the variant protein; and the last
column indicates whether the position is different on the variant protein).

Table 1182 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
450                                         no
289                                         no
112                                         yes                            112
90                                          yes                            90



Variant protein R38144_PEA_2_P15 is encoded by the following transcript(s):
R38144_PEA_2_T15, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T15 is shown in bold; this coding portion starts at
position 91 and ends at position 951. The transcript also has the following SNPs as listed in
Table 1183 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P15 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1183 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
251                                    C -> T                      Yes
253                                    T -> C                      Yes
308                                    G -> T                      Yes
312                                    T -> C                      No
350                                    T ->                        No
523                                    C ->                        No
523                                    C -> G                      No
581                                    C ->                        No
581                                    C -> G                      No
698                                    C ->                        No
698                                    C -> G                      No
723                                    C ->                        No
798                                    C ->                        No
798                                    C -> G                      No
849                                    -> C                        No
849                                    -> G                        No
884                                    T -> G                      No
901                                    -> C                        No
901                                    -> T                        No
1001                                   T ->                        No
1093                                   C -> T                      Yes
1242                                   A -> G                      No
1275                                   T -> C                      No
1331                                   G -> A                      Yes
1410                                   C -> T                      No
1416                                   C ->                        No
1494                                   G -> A                      Yes
1496                                   A -> T                      No
1613                                   A ->                        No
1613                                   A -> C                      No
1631                                   C -> T                      Yes
1649                                   C ->                        No
1649                                   C -> G                      No
1694                                   T -> G                      No
1705                                   A -> C                      Yes



Variant protein R38144_PEA_2_P19 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T19. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R38144_PEA_2_P19 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P19, comprising a first
amino acid sequence being at least 90% homologous to
MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTV
NLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLV
GNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTR
FDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGG
LPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESIEKISKVEC
GFAT corresponding to amino acids 1 - 412 of CT31_HUMAN, which also corresponds to
amino acids 1 - 412 of R38144_PEA_2_P19, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
KRSRSVAQAGVQWCDHDSPQP corresponding to amino acids 413 - 433 of
R38144_PEA_2_P19, wherein said first amino acid sequence and second amino acid sequence
are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P19, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence KRSRSVAQAGVQWCDHDSPQP in R38144_PEA_2_P19.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R38144_PEA_2_P19 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1184 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P19
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1184 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
54                                        A -> V                       Yes
55                                        F -> L                       Yes
73                                        S -> I                       Yes
87                                        I ->                         No
145                                       P ->                         No
145                                       P -> A                       No
164                                       A -> G                       No
164                                       A ->                         No
203                                       A -> G                       No
203                                       A ->                         No
211                                       D ->                         No
236                                       G ->                         No
265                                       V -> G                       No
285                                       K ->                         No
294                                       D -> N                       No
305                                       G -> E                       No
323                                       Q -> R                       No
346                                       F ->                         No



The glycosylation sites of variant protein R38144_PEA_2_P19, as compared to the
known protein Putative alpha-mannosidase C20orf31 precursor, are described in Table 1185
(given according to their position(s) on the amino acid sequence in the first column; the
second column indicates whether the glycosylation site is present in the variant protein; and the
last column indicates whether the position is different on the variant protein).

Table 1185 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
450                                         no
289                                         yes                            289
112                                         yes                            112
90                                          yes                            90



Variant protein R38144_PEA_2_P19 is encoded by the following transcript(s):
R38144_PEA_2_T19, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T19 is shown in bold; this coding portion starts at
position 91 and ends at position 1389. The transcript also has the following SNPs as listed in
Table 1186 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P19 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1186 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
251                                    C -> T                      Yes
253                                    T -> C                      Yes
308                                    G -> T                      Yes
312                                    T -> C                      No
350                                    T ->                        No
523                                    C ->                        No
523                                    C -> G                      No
581                                    C ->                        No
581                                    C -> G                      No
698                                    C ->                        No
698                                    C -> G                      No
723                                    C ->                        No
798                                    C ->                        No
798                                    C -> G                      No
849                                    -> C                        No
849                                    -> G                        No
884                                    T -> G                      No
901                                    -> C                        No
901                                    -> T                        No
943                                    A ->                        No
970                                    G -> A                      No
1004                                   G -> A                      No
1058                                   A -> G                      No
1126                                   T ->                        No
1218                                   C -> T                      Yes
1446                                   C ->                        Yes



Variant protein R38144_PEA_2_P24 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T27. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor) at the end of the application. One or more alignments to one or more
previously published protein sequences are given at the end of the application. A brief
description of the relationship of the variant protein according to the present invention to each
such aligned protein is as follows:

Comparison report between R38144_PEA_2_P24 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P24, comprising a first
amino acid sequence being at least 90% homologous to
MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFD
ELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFET
NIR corresponding to amino acids 1 - 121 of CT31_HUMAN, which also corresponds to amino
acids 1 - 121 of R38144_PEA_2_P24, and a second amino acid sequence being at least 90%
homologous to
EYNKAIRNYTRFDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLN
YYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDA
VESIEKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDA
VITPYGECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPSQPFTSKLALLGQVFL
DSS corresponding to amino acids 282 - 578 of CT31_HUMAN, which also corresponds to
amino acids 122 - 418 of R38144_PEA_2_P24, wherein said first amino acid sequence and
second amino acid sequence are contiguous and in a sequential order.
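
The two ranges in the comparison report above define a position mapping between the variant and the known protein. The following Python sketch makes that mapping explicit; the segment coordinates are copied from the report, while the function names are illustrative assumptions. Positions 122 - 281 of the known protein have no counterpart in the variant and map to None.

```python
from typing import Optional

# (variant_start, variant_end, known_start, known_end), 1-based inclusive,
# as stated in the comparison report for R38144_PEA_2_P24.
SEGMENTS = [
    (1, 121, 1, 121),
    (122, 418, 282, 578),
]


def variant_to_known(pos: int) -> Optional[int]:
    """Map a position on the variant protein to the known protein."""
    for v_start, v_end, k_start, _k_end in SEGMENTS:
        if v_start <= pos <= v_end:
            return k_start + (pos - v_start)
    return None


def known_to_variant(pos: int) -> Optional[int]:
    """Map a position on the known protein to the variant protein."""
    for v_start, _v_end, k_start, k_end in SEGMENTS:
        if k_start <= pos <= k_end:
            return v_start + (pos - k_start)
    return None


print(variant_to_known(296))  # 456
print(known_to_variant(200))  # None
```

Under this mapping, variant position 296 corresponds to position 456 of the known protein, where Table 1172 lists a known A -> T polymorphism.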

2. An isolated chimeric polypeptide encoding for an edge portion of R38144_PEA_2_P24,
comprising a polypeptide having a length "n", wherein n is at least about 10 amino acids in
length, optionally at least about 20 amino acids in length, preferably at least about 30 amino
acids in length, more preferably at least about 40 amino acids in length and most preferably at
least about 50 amino acids in length, wherein at least two amino acids comprise RE, having a
structure as follows: a sequence starting from any of amino acid numbers 121-x to 121; and
ending at any of amino acid numbers 122 + ((n-2) - x), in which x varies from 0 to n-2.
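The edge-portion definition above can be read as a family of windows of length n that always contain both junction residues (121 and 122). The following is a minimal sketch of that reading, not part of the original disclosure; the function name `edge_windows` and the optional `protein_length` bound (418, taken from the stated range 122-418) are illustrative assumptions.

```python
from typing import Iterator, Optional, Tuple


def edge_windows(junction_left: int, n: int,
                 protein_length: Optional[int] = None) -> Iterator[Tuple[int, int]]:
    """Yield 1-based inclusive (start, end) pairs for every window of
    length n that contains both residues flanking the junction."""
    if n < 2:
        raise ValueError("n must be at least 2 to cover both junction residues")
    junction_right = junction_left + 1
    for x in range(n - 1):                      # x runs from 0 to n-2
        start = junction_left - x               # 121-x
        end = junction_right + ((n - 2) - x)    # 122 + ((n-2) - x)
        if start < 1:
            continue                            # window would start before residue 1
        if protein_length is not None and end > protein_length:
            continue                            # window would run past the C-terminus
        yield start, end


# Junction between residues 121 and 122, window length n = 10.
windows = list(edge_windows(junction_left=121, n=10, protein_length=418))
for start, end in windows:
    print(start, end, end - start + 1)
```

Every window produced this way has exactly n residues, because end - start + 1 = (122 + n - 2 - x) - (121 - x) + 1 = n.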

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R38144_PEA_2_P24 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1187 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P24
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1187 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
54                                        A -> V                       Yes
55                                        F -> L                       Yes
73                                        S -> I                       Yes
87                                        I ->                         No
125                                       K ->                         No
134                                       D -> N                       No
145                                       G -> E                       No
163                                       Q -> R                       No
186                                       F ->                         No
266                                       E -> G                       No
277                                       L -> P                       No
296                                       A -> T                       Yes
322                                       P -> L                       No
324                                       A ->                         No
350                                       R -> Q                       Yes
351                                       S -> C                       No
390                                       K ->                         No
390                                       K -> Q                       No
396                                       L -> F                       Yes
402                                       P ->                         No
402                                       P -> A                       No
417                                       S -> A                       No



The glycosylation sites of variant protein R38144_PEA_2_P24, as compared to the
known protein Putative alpha-mannosidase C20orf31 precursor, are described in Table 1188
(given according to their position(s) on the amino acid sequence in the first column; the second
column indicates whether the glycosylation site is present in the variant protein; and the last
column indicates whether the position is different on the variant protein).

Table 1188 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
450                                         yes                            290
289                                         yes                            129
112                                         yes                            112
90                                          yes                            90
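The shifted positions in Table 1188 follow from the two aligned blocks given in the comparison report (known residues 1-121 map to variant residues 1-121, and known residues 282-578 map to variant residues 122-418). A minimal sketch of that coordinate conversion follows; the names `SEGMENTS` and `known_to_variant` are illustrative and not taken from the application.

```python
from typing import Optional

# (known_start, known_end, variant_start), 1-based inclusive, taken from the
# comparison report between R38144_PEA_2_P24 and CT31_HUMAN.
SEGMENTS = [
    (1, 121, 1),       # known 1-121   -> variant 1-121
    (282, 578, 122),   # known 282-578 -> variant 122-418
]


def known_to_variant(pos: int) -> Optional[int]:
    """Map a position on the known protein to the variant protein.
    Returns None when the residue is absent from the variant."""
    for known_start, known_end, variant_start in SEGMENTS:
        if known_start <= pos <= known_end:
            return variant_start + (pos - known_start)
    return None


sites = {site: known_to_variant(site) for site in (450, 289, 112, 90)}
print(sites)
```

A known-protein residue between 122 and 281 has no counterpart in this variant, so the sketch returns `None` for it.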



Variant protein R38144_PEA_2_P24 is encoded by the following transcript(s):
R38144_PEA_2_T27, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T27 is shown in bold; this coding portion starts at
position 91 and ends at position 1344. The transcript also has the following SNPs as listed in
Table 1189 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P24 sequence provides support for the deduced
sequence of this variant protein according to the present invention).
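Given the stated coding portion (positions 91 to 1344 of transcript R38144_PEA_2_T27), a nucleotide position can be converted to the amino acid position it falls in, which is how the nucleotide SNP positions relate to the amino acid positions in Table 1187. The sketch below assumes a simple ungapped reading frame starting at position 91; the function name `codon_number` is illustrative.

```python
from typing import Optional

CDS_START = 91     # first coding nucleotide, as stated in the text
CDS_END = 1344     # last coding nucleotide, as stated in the text


def codon_number(nt_pos: int) -> Optional[int]:
    """Return the 1-based amino acid position whose codon contains nt_pos,
    or None if nt_pos lies outside the coding portion."""
    if not CDS_START <= nt_pos <= CDS_END:
        return None
    return (nt_pos - CDS_START) // 3 + 1


protein_length = (CDS_END - CDS_START + 1) // 3
examples = {nt: codon_number(nt) for nt in (120, 251, 253, 1339, 1350)}
print(protein_length, examples)
```

Under this assumption the coding portion spans 418 codons, matching the 418-residue length of the variant protein, and nucleotide 1350 falls outside the coding portion.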

Table 1189 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
251                                    C -> T                      Yes
253                                    T -> C                      Yes
308                                    G -> T                      Yes
312                                    T -> C                      No
350                                    T ->                        No
463                                    A ->                        No
490                                    G -> A                      No
524                                    G -> A                      No
578                                    A -> G                      No
646                                    T ->                        No
738                                    C -> T                      Yes
887                                    A -> G                      No
920                                    T -> C                      No
976                                    G -> A                      Yes
1055                                   C -> T                      No
1061                                   C ->                        No
1139                                   G -> A                      Yes
1141                                   A -> T                      No
1258                                   A ->                        No
1258                                   A -> C                      No
[...]                                  C -> T                      Yes
1294                                   C ->                        No
1294                                   C -> G                      No
1339                                   T -> G                      No
1350                                   A -> C                      Yes



Variant protein R38144_PEA_2_P36 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R38144_PEA_2_T10. An alignment is given to the known protein (Putative alpha-mannosidase
C20orf31 precursor; SEQ ID NO: 1459) at the end of the application. One or more alignments to
one or more previously published protein sequences are given at the end of the application. A
brief description of the relationship of the variant protein according to the present invention to
each such aligned protein is as follows:

Comparison report between R38144_PEA_2_P36 and AAH16184 (SEQ ID NO: 1460):

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first
amino acid sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR corresponding to amino acids 1-36
of AAH16184, which also corresponds to amino acids 1-36 of R38144_PEA_2_P36, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence FWGMSQNSKEWLKCSRTAWTLILM corresponding to amino acids 37-60
of R38144_PEA_2_P36, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence FWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.

Comparison report between R38144_PEA_2_P36 and AAQ88943 (SEQ ID NO: 1461):

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first
amino acid sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHY corresponding to amino acids 1-35 of
AAQ88943, which also corresponds to amino acids 1-35 of R38144_PEA_2_P36, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence RFWGMSQNSKEWLKCSRTAWTLILM corresponding to amino acids
36-60 of R38144_PEA_2_P36, wherein said first amino acid sequence and second amino acid
sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence RFWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.
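The AAH16184 and AAQ88943 comparison reports split the same 60-residue variant at different points (36 + 24 residues versus 35 + 25 residues). As a quick consistency check, and assuming the sequences exactly as quoted in those two reports, the following sketch joins each head and tail and confirms that both splits describe one and the same polypeptide. The variable names are illustrative.

```python
# Head and tail as quoted in the AAH16184 comparison report (1-36 and 37-60).
HEAD_36 = "MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR"
TAIL_24 = "FWGMSQNSKEWLKCSRTAWTLILM"

# Head and tail as quoted in the AAQ88943 comparison report (1-35 and 36-60).
HEAD_35 = "MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHY"
TAIL_25 = "RFWGMSQNSKEWLKCSRTAWTLILM"

# "Contiguous and in a sequential order" is modelled as plain concatenation.
variant_a = HEAD_36 + TAIL_24
variant_b = HEAD_35 + TAIL_25

print(len(variant_a), len(variant_b), variant_a == variant_b)
```

The only difference between the two reports is whether the arginine at position 36 is counted with the shared head or with the unique tail.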

Comparison report between R38144_PEA_2_P36 and CT31_HUMAN:

1. An isolated chimeric polypeptide encoding for R38144_PEA_2_P36, comprising a first
amino acid sequence being at least 90% homologous to

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR corresponding to amino acids 1-36
of CT31_HUMAN, which also corresponds to amino acids 1-36 of R38144_PEA_2_P36, and
a second amino acid sequence being at least 70%, optionally at least 80%, preferably at least
85%, more preferably at least 90% and most preferably at least 95% homologous to a
polypeptide having the sequence FWGMSQNSKEWLKCSRTAWTLILM corresponding to
amino acids 37-60 of R38144_PEA_2_P36, wherein said first amino acid sequence and second
amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R38144_PEA_2_P36, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence FWGMSQNSKEWLKCSRTAWTLILM in R38144_PEA_2_P36.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R38144_PEA_2_P36 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1190 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R38144_PEA_2_P36
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1190 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
10                                        G ->                         No
37                                        F ->                         No


The glycosylation sites of variant protein R38144_PEA_2_P36, as compared to the
known protein Putative alpha-mannosidase C20orf31 precursor, are described in Table
1191 (given according to their position(s) on the amino acid sequence in the first column; the
second column indicates whether the glycosylation site is present in the variant protein; and the
last column indicates whether the position is different on the variant protein).

Table 1191 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?
450                                         no
289                                         no
112                                         no
90                                          no

Variant protein R38144_PEA_2_P36 is encoded by the following transcript(s):
R38144_PEA_2_T10, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R38144_PEA_2_T10 is shown in bold; this coding portion starts at
position 91 and ends at position 270. The transcript also has the following SNPs as listed in
Table 1192 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R38144_PEA_2_P36 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1192 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
120                                    C ->                        No
199                                    T ->                        No
372                                    C ->                        No
372                                    C -> G                      No
430                                    C ->                        No
430                                    C -> G                      No
547                                    C ->                        No
547                                    C -> G                      No
572                                    C ->                        No
647                                    C ->                        No
647                                    C -> G                      No
698                                    -> C                        No
698                                    -> G                        No
733                                    T -> G                      No
750                                    -> C                        No
750                                    -> T                        No
792                                    A ->                        No
819                                    G -> A                      No
853                                    G -> A                      No
[...]                                  A -> G                      No
975                                    T ->                        No
1067                                   C -> T                      Yes
1216                                   A -> G                      No
1249                                   T -> C                      No
1305                                   G -> A                      Yes
1384                                   C -> T                      No
1390                                   C ->                        No
1468                                   G -> A                      Yes
1470                                   A -> T                      No
1587                                   A ->                        No
1587                                   A -> C                      No
1605                                   C -> T                      Yes
1623                                   C ->                        No
1623                                   C -> G                      No
1668                                   T -> G                      No
1679                                   A -> C                      Yes



As noted above, cluster R38144 features 24 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R38144_PEA_2_node_21 according to the present invention is supported
by 108 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1193 below
describes the starting and ending position of this segment on each transcript.

Table 1193 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       626                          792
R38144_PEA_2_T10      475                          641
R38144_PEA_2_T13      626                          792
R38144_PEA_2_T15      626                          792
R38144_PEA_2_T19      626                          792
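Because a segment is the same stretch of nucleic acid sequence wherever it occurs, its length should be identical on every transcript even when its coordinates differ. The following sketch checks this for the Table 1193 values; the dictionary name `TABLE_1193` and the helper `segment_lengths` are illustrative, and the interpretation of positions as 1-based inclusive is an assumption.

```python
from typing import Dict, Tuple

# (start, end) of segment R38144_PEA_2_node_21 on each transcript (Table 1193).
TABLE_1193: Dict[str, Tuple[int, int]] = {
    "R38144_PEA_2_T6": (626, 792),
    "R38144_PEA_2_T10": (475, 641),
    "R38144_PEA_2_T13": (626, 792),
    "R38144_PEA_2_T15": (626, 792),
    "R38144_PEA_2_T19": (626, 792),
}


def segment_lengths(locations: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Length of the segment on each transcript, treating both ends as inclusive."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


lengths = segment_lengths(TABLE_1193)
# Offset of the segment on T10 relative to T6.
offset_t10 = TABLE_1193["R38144_PEA_2_T6"][0] - TABLE_1193["R38144_PEA_2_T10"][0]
print(lengths, offset_t10)
```

A differing start position with an identical length, as for R38144_PEA_2_T10 here, indicates that upstream sequence present in the other transcripts is absent from that transcript.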



Segment cluster R38144_PEA_2_node_26 according to the present invention is supported
by 98 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1194 below
describes the starting and ending position of this segment on each transcript.

Table 1194 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       793                          934
R38144_PEA_2_T10      642                          783
R38144_PEA_2_T13      793                          934
R38144_PEA_2_T15      793                          934
R38144_PEA_2_T19      793                          934
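Consecutive segments are expected to abut on a transcript, so that one segment starts at the position immediately after the previous one ends. As an illustrative check (the names `NODE_21`, `NODE_26` and `is_contiguous` are not from the application), the sketch below compares the Table 1193 and Table 1194 coordinates transcript by transcript.

```python
from typing import Dict, Tuple

Locations = Dict[str, Tuple[int, int]]

# Segment coordinates copied from Table 1193 (node_21) and Table 1194 (node_26).
NODE_21: Locations = {
    "R38144_PEA_2_T6": (626, 792),
    "R38144_PEA_2_T10": (475, 641),
    "R38144_PEA_2_T13": (626, 792),
    "R38144_PEA_2_T15": (626, 792),
    "R38144_PEA_2_T19": (626, 792),
}
NODE_26: Locations = {
    "R38144_PEA_2_T6": (793, 934),
    "R38144_PEA_2_T10": (642, 783),
    "R38144_PEA_2_T13": (793, 934),
    "R38144_PEA_2_T15": (793, 934),
    "R38144_PEA_2_T19": (793, 934),
}


def is_contiguous(upstream: Locations, downstream: Locations) -> Dict[str, bool]:
    """For each transcript carrying both segments, report whether the downstream
    segment begins exactly one position after the upstream segment ends."""
    shared = upstream.keys() & downstream.keys()
    return {name: downstream[name][0] == upstream[name][1] + 1 for name in sorted(shared)}


adjacency = is_contiguous(NODE_21, NODE_26)
print(adjacency)
```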



Segment cluster R38144_PEA_2_node_29 according to the present invention is supported
by 98 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T19 and R38144_PEA_2_T27. Table 1195 below
describes the starting and ending position of this segment on each transcript.

Table 1195 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       935                          1059
R38144_PEA_2_T10      784                          908
R38144_PEA_2_T13      935                          1059
R38144_PEA_2_T19      935                          1059
R38144_PEA_2_T27      455                          579



Segment cluster R38144_PEA_2_node_31 according to the present invention is supported
by 95 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27. Table 1196 below
describes the starting and ending position of this segment on each transcript.

Table 1196 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1060                         1204
R38144_PEA_2_T10      909                          1053
R38144_PEA_2_T15      935                          1079
R38144_PEA_2_T19      1060                         1204
R38144_PEA_2_T27      580                          724



Segment cluster R38144_PEA_2_node_46 according to the present invention is supported
by 147 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T27. Table 1197 below
describes the starting and ending position of this segment on each transcript.

Table 1197 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1373                         1544
R38144_PEA_2_T10      1197                         1368
R38144_PEA_2_T13      1203                         1374
R38144_PEA_2_T15      1223                         1394
R38144_PEA_2_T27      868                          1039



Segment cluster R38144_PEA_2_node_47 according to the present invention is supported
by 147 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T27. Table 1198 below
describes the starting and ending position of this segment on each transcript.

Table 1198 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1545                         1919
R38144_PEA_2_T10      1369                         1743
R38144_PEA_2_T13      1375                         1749
R38144_PEA_2_T15      1395                         1769
R38144_PEA_2_T27      1040                         1414



Segment cluster R38144_PEA_2_node_49 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T19. Table 1199 below describes
the starting and ending position of this segment on each transcript.

Table 1199 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T19      1327                         1448



the above cluster are also provided. These segments are up to about 120 bp in length, and so are 
included in a separate description. 



Segment cluster R38144_PEA_2_node_0 according to the present invention is supported
by 101 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27.
Table 1200 below describes the starting and ending position of this segment on each transcript.

Table 1201 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       [...]                        105
R38144_PEA_2_T10      [...]                        105
R38144_PEA_2_T13      [...]                        105
R38144_PEA_2_T15      [...]                        105
R38144_PEA_2_T19      [...]                        105
R38144_PEA_2_T27      [...]                        105



Segment cluster R38144_PEA_2_node_1 according to the present invention is supported
by 105 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27.
Table 1202 below describes the starting and ending position of this segment on each transcript.

Table 1202 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       106                          197
R38144_PEA_2_T10      106                          197
R38144_PEA_2_T13      106                          197
R38144_PEA_2_T15      106                          197
R38144_PEA_2_T19      106                          197
R38144_PEA_2_T27      106                          197



Segment cluster R38144_PEA_2_node_4 according to the present invention is supported
by 107 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T13,
R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27. Table 1203 below
describes the starting and ending position of this segment on each transcript.

Table 1203 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       198                          299
R38144_PEA_2_T13      198                          299
R38144_PEA_2_T15      198                          299
R38144_PEA_2_T19      198                          299
R38144_PEA_2_T27      198                          299



Segment cluster R38144_PEA_2_node_5 according to the present invention can be found
in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T13,
R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27. Table 1204 below
describes the starting and ending position of this segment on each transcript.

Table 1204 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       300                          308
R38144_PEA_2_T13      300                          308
R38144_PEA_2_T15      300                          308
R38144_PEA_2_T19      300                          308
R38144_PEA_2_T27      300                          308



Segment cluster R38144_PEA_2_node_7 according to the present invention is supported
by 92 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T13,
R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27. Table 1205 below
describes the starting and ending position of this segment on each transcript.

Table 1205 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       309                          348
R38144_PEA_2_T13      309                          348
R38144_PEA_2_T15      309                          348
R38144_PEA_2_T19      309                          348
R38144_PEA_2_T27      309                          348



Segment cluster R38144_PEA_2_node_11 according to the present invention is supported
by 106 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27.
Table 1206 below describes the starting and ending position of this segment on each transcript.

Table 1206 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       349                          454
R38144_PEA_2_T10      198                          303
R38144_PEA_2_T13      349                          454
R38144_PEA_2_T15      349                          454
R38144_PEA_2_T19      349                          454
R38144_PEA_2_T27      349                          454



Segment cluster R38144_PEA_2_node_14 according to the present invention can be
found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1207 below
describes the starting and ending position of this segment on each transcript.

Table 1207 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       455                          460
R38144_PEA_2_T10      304                          309
R38144_PEA_2_T13      455                          460
R38144_PEA_2_T15      455                          460
R38144_PEA_2_T19      455                          460



Segment cluster R38144_PEA_2_node_15 according to the present invention is supported
by 105 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1208 below
describes the starting and ending position of this segment on each transcript.



WO 2006/131783 



PCT/IB2005/004037 



1215 



Table 1208 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       461                          487
R38144_PEA_2_T10      310                          336
R38144_PEA_2_T13      461                          487
R38144_PEA_2_T15      461                          487
R38144_PEA_2_T19      461                          487



Segment cluster R38144_PEA_2_node_16 according to the present invention is supported
by 106 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1209 below
describes the starting and ending position of this segment on each transcript.

Table 1209 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       488                          580
R38144_PEA_2_T10      337                          429
R38144_PEA_2_T13      488                          580
R38144_PEA_2_T15      488                          580
R38144_PEA_2_T19      488                          580



Segment cluster R38144_PEA_2_node_19 according to the present invention is supported
by 93 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1210 below
describes the starting and ending position of this segment on each transcript.

Table 1210 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       581                          615
R38144_PEA_2_T10      430                          464
R38144_PEA_2_T13      581                          615
R38144_PEA_2_T15      581                          615
R38144_PEA_2_T19      581                          615



Segment cluster R38144_PEA_2_node_20 according to the present invention can be
found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T19. Table 1211 below
describes the starting and ending position of this segment on each transcript.

Table 1211 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       616                          625
R38144_PEA_2_T10      465                          474
R38144_PEA_2_T13      616                          625
R38144_PEA_2_T15      616                          625
R38144_PEA_2_T19      616                          625



Segment cluster R38144_PEA_2_node_36 according to the present invention is supported
by 95 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27.
Table 1212 below describes the starting and ending position of this segment on each transcript.

Table 1212 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1205                         1293
R38144_PEA_2_T10      1054                         1142
R38144_PEA_2_T13      1060                         1148
R38144_PEA_2_T15      1080                         1168
R38144_PEA_2_T19      1205                         1293
R38144_PEA_2_T27      725                          813



Segment cluster R38144_PEA_2_node_37 according to the present invention is supported
by 97 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15, R38144_PEA_2_T19 and R38144_PEA_2_T27.
Table 1213 below describes the starting and ending position of this segment on each transcript.

Table 1213 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1294                         1326
R38144_PEA_2_T10      1143                         1175
R38144_PEA_2_T13      1149                         1181
R38144_PEA_2_T15      1169                         1201
R38144_PEA_2_T19      1294                         1326
R38144_PEA_2_T27      814                          846



Segment cluster R38144_PEA_2_node_43 according to the present invention can be
found in the following transcript(s): R38144_PEA_2_T6. Table 1214 below describes the
starting and ending position of this segment on each transcript.

Table 1214 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1327                         1346



Segment cluster R38144_PEA_2_node_44 according to the present invention can be
found in the following transcript(s): R38144_PEA_2_T6. Table 1215 below describes the
starting and ending position of this segment on each transcript.

Table 1215 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1347                         1351



Segment cluster R38144_PEA_2_node_45 according to the present invention can be
found in the following transcript(s): R38144_PEA_2_T6, R38144_PEA_2_T10,
R38144_PEA_2_T13, R38144_PEA_2_T15 and R38144_PEA_2_T27. Table 1216 below
describes the starting and ending position of this segment on each transcript.

Table 1216 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T6       1352                         1372
R38144_PEA_2_T10      1176                         1196
R38144_PEA_2_T13      1182                         1202
R38144_PEA_2_T15      1202                         1222
R38144_PEA_2_T27      847                          867

Segment cluster R38144_PEA_2_node_51 according to the present invention is supported
by 1 library. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R38144_PEA_2_T19. Table 1217 below describes
the starting and ending position of this segment on each transcript.

Table 1217 - Segment location on transcripts

Transcript name       Segment starting position    Segment ending position
R38144_PEA_2_T19      1449                         1522



Variant protein alignment to the previously known protein:

Sequence name: CT31_HUMAN

Sequence documentation:

Alignment of: R38144_PEA_2_P6 x CT31_HUMAN

Alignment segment 1/1:

Quality: 4031.00
Escore: 0
Matching length: 413
Total length: 413
Matching Percent Similarity: 100.00
Matching Percent Identity: 99.76
Total Percent Similarity: 100.00
Total Percent Identity: 99.76
Gaps: 0



Alignment:

  1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50

 51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100

101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150

151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200

201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250

251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300

301 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 350

351 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 400
    ||||||||||||||||||||||||||||||||||||||||||||||||||
351 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 400

401 EKISKVECGFATL 413
    ||||||||||||:
401 EKISKVECGFATI 413



Sequence name: CT31_HUMAN
Sequence documentation:

Alignment of: R38144_PEA_2_P13 x CT31_HUMAN

Alignment segment 1/1:

Quality: 3167.00
Escore: 0
Matching length: 326   Total length: 326
Matching Percent Similarity: 100.00   Matching Percent Identity: 99.39
Total Percent Similarity: 100.00      Total Percent Identity: 99.39
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50

  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100

 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150

 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200

 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250

 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300

 301 QMYKGTVSMPVFQSLEAYWPGLQNLL 326
     |||||||||||||||||||||||:|:
 301 QMYKGTVSMPVFQSLEAYWPGLQSLI 326



Sequence name: CT31_HUMAN
Sequence documentation:

Alignment of: R38144_PEA_2_P15 x CT31_HUMAN

Alignment segment 1/1:

Quality: 2725.00
Escore: 0
Matching length: 282   Total length: 282
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50

  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100

 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150

 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200

 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250

 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLE 282
     ||||||||||||||||||||||||||||||||
 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLE 282

Sequence name: CT31_HUMAN
Sequence documentation:

Alignment of: R38144_PEA_2_P19 x CT31_HUMAN

Alignment segment 1/1:

Quality: 4029.00
Escore: 0
Matching length: 412   Total length: 412
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50

  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100

 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150

 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200

 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250

 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300

 301 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 350

 351 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 400

 401 EKISKVECGFAT 412
     ||||||||||||
 401 EKISKVECGFAT 412

Sequence name: CT31_HUMAN
Sequence documentation:

Alignment of: R38144_PEA_2_P24 x CT31_HUMAN

Alignment segment 1/1:

Quality: 4063.00
Escore: 0
Matching length: 418   Total length: 578
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 72.32       Total Percent Identity: 72.32
Gaps: 1

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSY 50

  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 LENAFPFDELRPLTCDGHDTWGSFSLTLIDALDTLLILGNVSEFQRVVEV 100

 101 LQDSVDFDIDVNASVFETNIR............................. 121
     |||||||||||||||||||||
 101 LQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPL 150

 121 .................................................. 121

 151 LRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIV 200

 121 .................................................. 121

 201 EFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQD 250

 122 ...............................EYNKAIRNYTRFDDWYLWV 140
                                    |||||||||||||||||||
 251 AGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWV 300

 141 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 190
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 QMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLP 350

 191 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 240
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 EFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESI 400

 241 EKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNN 290
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 EKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNN 450

 291 GSTFDAVITPYGECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVED 340
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 GSTFDAVITPYGECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVED 500

 341 LMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPENHDQARERKPAK 390
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 LMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPENHDQARERKPAK 550

 391 QKVPLLSCPSQPFTSKLALLGQVFLDSS 418
     ||||||||||||||||||||||||||||
 551 QKVPLLSCPSQPFTSKLALLGQVFLDSS 578
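
The relationship between the "Matching" and "Total" percentages reported for each alignment can be illustrated with a short sketch. It assumes, as the reported figures suggest (418 identical residues over a total length of 578 gives 72.32), that the matching percentage is taken over the aligned, ungapped columns and the total percentage over all alignment columns. The function name, the "." gap symbol and the toy sequences are illustrative only and are not taken from the alignment software used in the application.

```python
def alignment_identity(variant: str, known: str, gap: str = ".") -> tuple[float, float]:
    """Return (matching percent identity, total percent identity).

    Both arguments are rows of one pairwise alignment and must have the
    same number of columns; `gap` marks a gap column (assumed convention).
    """
    if len(variant) != len(known):
        raise ValueError("aligned rows must have the same length")
    columns = list(zip(variant, known))
    # Columns where both sequences carry a residue (the "matching length").
    matched = [(a, b) for a, b in columns if a != gap and b != gap]
    if not matched:
        raise ValueError("alignment has no ungapped columns")
    identical = sum(1 for a, b in matched if a == b)
    matching_pct = round(100.0 * identical / len(matched), 2)
    total_pct = round(100.0 * identical / len(columns), 2)
    return matching_pct, total_pct


# Toy alignment: 6 ungapped columns, all identical, 8 columns in total.
print(alignment_identity("ACDE..GH", "ACDEFFGH"))  # (100.0, 75.0)
```

Under the same assumption, an alignment with 418 identical residues, a matching length of 418 and a total length of 578 yields 100.00 and 72.32, which agrees with the values listed above for R38144_PEA_2_P24.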
Sequence name: AAH16184
Sequence documentation:

Alignment of: R38144_PEA_2_P36 x AAH16184

Alignment segment 1/1:

Quality: 364.00
Escore: 0
Matching length: 36   Total length: 36
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR 36
     ||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR 36
Sequence name: AAQ88943
Sequence documentation:

Alignment of: R38144_PEA_2_P36 x AAQ88943

Alignment segment 1/1:

Quality: 362.00
Escore: 0
Matching length: 37   Total length: 37
Matching Percent Similarity: 97.30   Matching Percent Identity: 97.30
Total Percent Similarity: 97.30      Total Percent Identity: 97.30
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRF 37
     ||||||||||||||||||||||||||||||||||| |
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYSF 37
Sequence name: CT31_HUMAN
Sequence documentation:

Alignment of: R38144_PEA_2_P36 x CT31_HUMAN

Alignment segment 1/1:

Quality: 364.00
Escore: 0
Matching length: 36   Total length: 36
Matching Percent Similarity: 100.00   Matching Percent Identity: 100.00
Total Percent Similarity: 100.00      Total Percent Identity: 100.00
Gaps: 0

Alignment:

   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR 36
     ||||||||||||||||||||||||||||||||||||
   1 MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYR 36



DESCRIPTION FOR CLUSTER HUMOSTRO 
Cluster HUMOSTRO features 3 transcript(s) and 30 segment(s) of interest, the names for
which are given in Tables 1218 and 1219, respectively; the sequences themselves are given at
the end of the application. The selected protein variants are given in Table 1220.

Table 1218 - Transcripts of interest

Transcript Name              Sequence ID No.
HUMOSTRO_PEA_1_PEA_1_T14     141
HUMOSTRO_PEA_1_PEA_1_T16     142
HUMOSTRO_PEA_1_PEA_1_T30     143


Table 1219 - Segments of interest

Segment Name                     Sequence ID No.
HUMOSTRO_PEA_1_PEA_1_nocle U     961
HUMOSTRO_PEA_1_PEA_1_node_10     962
HUMOSTRO_PEA_1_PEA_1_node_16     963
HUMOSTRO_PEA_1_PEA_1_node_23     964
HUMOSTRO_PEA_1_PEA_1_node_31     965
HUMOSTRO_PEA_1_PEA_1_node_43     966
HUMOSTRO_PEA_1_PEA_1_node_3      967
HUMOSTRO_PEA_1_PEA_1_noae_D      968
HUMOSTRO_PEA_1_PEA_1_node_7      969
HUMOSTRO_PEA_1_PEA_1_node_8      970
HUMOSTRO_PEA_1_PEA_1_node_15     971
HUMOSTRO_PEA_1_PEA_1_node_17     972
HUMOSTRO_PEA_1_PEA_1_node_20     973
HUMOSTRO_PEA_1_PEA_1_node_21     974
HUMOSTRO_PEA_1_PEA_1_node_22     975
HUMOSTRO_PEA_1_PEA_1_node_24     976
HUMOSTRO_PEA_1_PEA_1_node_26     977
HUMOSTRO_PEA_1_PEA_1_node_27     978
HUMOSTRO_PEA_1_PEA_1_node_28     979
HUMOSTRO_PEA_1_PEA_1_node_29     980
HUMOSTRO_PEA_1_PEA_1_node_30     981
HUMOSTRO_PEA_1_PEA_1_node_32     982
HUMOSTRO_PEA_1_PEA_1_node_34     983
HUMOSTRO_PEA_1_PEA_1_node_36     984
HUMOSTRO_PEA_1_PEA_1_node_37     985
HUMOSTRO_PEA_1_PEA_1_node_38     986
HUMOSTRO_PEA_1_PEA_1_node_39     987
HUMOSTRO_PEA_1_PEA_1_node_40     988
HUMOSTRO_PEA_1_PEA_1_node_41     989
HUMOSTRO_PEA_1_PEA_1_node_42     990



Table 1220 - Proteins of interest

Protein Name                 Sequence ID No.    Corresponding Transcript(s)
HUMOSTRO_PEA_1_PEA_1_P21     1627               HUMOSTRO_PEA_1_PEA_1_T14
HUMOSTRO_PEA_1_PEA_1_P25     1628               HUMOSTRO_PEA_1_PEA_1_T16
HUMOSTRO_PEA_1_PEA_1_P30     1629               HUMOSTRO_PEA_1_PEA_1_T30



These sequences are variants of the known protein Osteopontin precursor (SwissProt
accession identifier OSTP_HUMAN; known also according to the synonyms Bone sialoprotein
1; Urinary stone protein; Secreted phosphoprotein 1; SPP-1; Nephropontin; Uropontin), SEQ ID
NO: 1462, referred to herein as the previously known protein.

Protein Osteopontin precursor is known or believed to have the following function(s):
Binds tightly to hydroxyapatite. Appears to form an integral part of the mineralized matrix.
Probably important to cell-matrix interaction. Acts as a cytokine involved in enhancing
production of interferon-gamma and interleukin-12 and reducing production of interleukin-10
and is essential in the pathway that leads to type I immunity (By similarity). The sequence for
protein Osteopontin precursor is given at the end of the application, as "Osteopontin precursor
amino acid sequence". Known polymorphisms for this sequence are as shown in Table 1221.

Table 1221 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
301                                       R -> H (in dbSNP:4660). /FTId=VAR_014717.
188                                       D -> H
237                                       T -> A
275 - 278                                 SHEF -> GNSL



Protein Osteopontin precursor localization is believed to be Secreted. 



The previously known protein also has the following indication(s) and/or potential
therapeutic use(s): Regeneration, bone. It has been investigated for clinical/therapeutic use in
humans, for example as a target for an antibody or small molecule, and/or as a direct
therapeutic; available information related to these investigations is as follows. Potential
pharmaceutically related or therapeutically related activity or activities of the previously known
protein are as follows: Bone formation stimulant. A therapeutic role for a protein represented by
the cluster has been predicted. The cluster was assigned this field because there was information
in the drug database or the public databases (e.g., described herein above) that this protein, or
part thereof, is used or can be used for a potential therapeutic indication: Musculoskeletal.

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: ossification; anti-apoptosis; inflammatory response; cell-matrix
adhesion; cell-cell signaling, which are annotation(s) related to Biological Process;
defense/immunity protein; cytokine; integrin ligand; protein binding; growth factor; apoptosis
inhibitor, which are annotation(s) related to Molecular Function; and extracellular matrix, which
are annotation(s) related to Cellular Component.

The GO assignment relies on information from one or more of the SwissProt/TrEMBL
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster HUMOSTRO can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 46 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
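
As a minimal sketch of the "parts per million" figure described above, the ratio can be computed as follows. The function name and the EST counts are hypothetical, and the sketch ignores any weighting scheme, since the weighting itself is described elsewhere in the application and is not reproduced here.

```python
def expression_ppm(cluster_ests: int, category_ests: int) -> float:
    """Ratio of ESTs assigned to one cluster to all ESTs in a tissue
    category, scaled to parts per million (unweighted illustration)."""
    if category_ests <= 0:
        raise ValueError("category must contain at least one EST")
    if cluster_ests < 0 or cluster_ests > category_ests:
        raise ValueError("cluster count must lie between 0 and the category total")
    return 1_000_000 * cluster_ests / category_ests


# Hypothetical counts: 5 cluster ESTs among 500,000 ESTs in the category.
print(expression_ppm(5, 500_000))  # 10.0
```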

Overall, the following results were obtained as shown with regard to the histograms in 
Figure 46 and Table 1222. This cluster is overexpressed (at least at a minimum level) in the 
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors 
from different tissues, lung malignant tumors, breast malignant tumors, ovarian carcinoma and 
skin malignancies. 



Table 1222 - Normal tissue distribution

Name of Tissue    Number
Adrenal           4
Bladder           0
Bone              897
Brain             506
Colon             69
Epithelial        548
General           484
head and neck     50
Kidney            5618
Liver             4
Lung              10
lymph nodes       75
Breast            8
bone marrow       62
Muscle            37
Ovary             40
Pancreas          845
Prostate          48
Skin              13
Stomach           73
Thyroid           0
Uterus            168



Table 1223 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3      SP2        R4
Adrenal           1.5e-01    2.1e-01    2.0e-02    4.6     4.4e-02    3.6
Bladder           1.2e-01    9.2e-02    5.7e-02    4.1     2.1e-02    4.3
Bone              4.9e-01    7.4e-01    4.1e-06    0.6     5.4e-01    0.4
Brain             6.6e-01    7.0e-01    3.2e-01    0.6     1          0.4
Colon             2.7e-01    4.0e-01    3.1e-01    1.5     5.2e-01    1.1
Epithelial        2.0e-07    1.6e-03    9.8e-01    0.7     1          0.5
General           1.2e-06    1.2e-02    7.9e-01    0.8     1          0.6
head and neck     3.4e-01    5.0e-01    1          0.7     1          0.7
Kidney            6.8e-01    7.4e-01    1          0.2     1          0.1
Liver             3.3e-01    2.5e-01    1          1.8     2.3e-01    2.6
Lung              4.3e-04    4.6e-03    2.1e-30    15.0    2.8e-27    23.5
lymph nodes       6.7e-01    8.7e-01    8.1e-01    0.7     9.9e-01    0.3
Breast            2.3e-01    3.0e-01    1.9e-04    6.2     4.1e-03    4.3
bone marrow       7.5e-01    7.8e-01    1          0.3     2.0e-02    1.2
Muscle            4.0e-02    7.5e-02    1.1e-01    4.6     5.1e-01    1.5
Ovary             4.7e-02    8.4e-02    1.9e-05    5.4     8.3e-04    3.7
Pancreas          5.0e-02    3.3e-01    1          0.3     1          0.2
Prostate          8.5e-01    9.0e-01    8.9e-01    0.7     9.5e-01    0.6
Skin              1.6e-01    1.6e-01    1.2e-10    12.6    5.2e-04    4.1
Stomach           1.5e-01    6.3e-01    5.0e-01    1.2     9.4e-01    0.6
Thyroid           2.9e-01    2.9e-01    5.9e-02    2.0     5.9e-02    2.0
Uterus            6.1e-02    5.7e-01    1.1e-01    1.3     7.0e-01    0.7
As noted above, cluster HUMOSTRO features 3 transcript(s), which were listed in Table
1218 above. These transcript(s) encode for protein(s) which are variant(s) of protein Osteopontin
precursor. A description of each variant protein according to the present invention is now
provided.

Variant protein HUMOSTRO_PEA_1_PEA_1_P21 according to the present invention has
an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMOSTRO_PEA_1_PEA_1_T14. An alignment is given to the known protein (Osteopontin
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between HUMOSTRO_PEA_1_PEA_1_P21 and OSTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P21,
comprising a first amino acid sequence being at least 90% homologous to
MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQ
corresponding to amino acids 1 - 58 of OSTP_HUMAN, which also corresponds to amino acids
1 - 58 of HUMOSTRO_PEA_1_PEA_1_P21, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence VFLNFS
corresponding to amino acids 59 - 64 of HUMOSTRO_PEA_1_PEA_1_P21, wherein said first
amino acid sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMOSTRO_PEA_1_PEA_1_P21,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence VFLNFS in HUMOSTRO_PEA_1_PEA_1_P21.
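
The head-plus-tail structure in the comparison report above can be checked with a short sketch. The two sequences are those quoted in the report; the helper names are illustrative, and treating "homologous" as position-by-position identity over equal-length, ungapped sequences is a simplifying assumption made here, not a definition taken from the application.

```python
# Sequences quoted in the comparison report above.
HEAD = "MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQ"  # OSTP_HUMAN 1 - 58
TAIL = "VFLNFS"  # unique tail, amino acids 59 - 64 of the variant


def assemble_chimera(head: str, tail: str) -> str:
    """Join the two segments contiguously and in sequential order."""
    return head + tail


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped, position-by-position identity (assumed stand-in for
    the 'homologous' thresholds recited in the report)."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length for this simple measure")
    same = sum(1 for a, b in zip(candidate, reference) if a == b)
    return round(100.0 * same / len(reference), 2)


variant = assemble_chimera(HEAD, TAIL)
print(len(HEAD), len(variant))               # 58 64
print(variant[58:64])                        # VFLNFS
print(percent_identity("VFLNFA", TAIL))      # 83.33
```

A hypothetical tail differing at one of six positions (as in the last line) would fall at 83.33% under this measure, that is, above the 70% and 80% thresholds but below 85%.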



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because of manual inspection of
known protein localization and/or gene structure.

Variant protein HUMOSTRO_PEA_1_PEA_1_P21 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1224 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P21 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 1224 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
7                                         C -> W                       No
31                                        Q -> R                       No
47                                        D -> V                       Yes
49                                        S -> P                       No



The glycosylation sites of variant protein HUMOSTRO_PEA_1_PEA_1_P21, as
compared to the known protein Osteopontin precursor, are described in Table 1225 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).



Table 1225 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?
79                                          no
106                                         no



Variant protein HUMOSTRO_PEA_1_PEA_1_P21 is encoded by the following
transcript(s): HUMOSTRO_PEA_1_PEA_1_T14, for which the sequence(s) is/are given at the
end of the application. The coding portion of transcript HUMOSTRO_PEA_1_PEA_1_T14 is
shown in bold; this coding portion starts at position 199 and ends at position 390. The transcript
also has the following SNPs as listed in Table 1226 (given according to their position on the
nucleotide sequence, with the alternative nucleic acid listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P21 sequence provides support for the deduced sequence of this
variant protein according to the present invention).
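
The stated coding coordinates can be related to the protein-level positions with a small sketch. It assumes 1-based, inclusive nucleotide positions and that the stated end position excludes the stop codon; the second assumption is suggested by the figures themselves (positions 199 to 390 span 192 nucleotides, or 64 codons, matching the 64 amino acids of the variant). The function names are illustrative.

```python
def protein_length(cds_start: int, cds_end: int) -> int:
    """Number of codons in a coding portion given 1-based inclusive bounds."""
    span = cds_end - cds_start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("coding portion must be a positive multiple of 3")
    return span // 3


def codon_number(nucleotide_pos: int, cds_start: int) -> int:
    """1-based amino acid position affected by a nucleotide inside the CDS."""
    if nucleotide_pos < cds_start:
        raise ValueError("position lies upstream of the coding portion")
    return (nucleotide_pos - cds_start) // 3 + 1


print(protein_length(199, 390))   # 64
# Nucleotide SNP positions 219, 290, 338 and 343 fall in codons 7, 31, 47
# and 49, the amino acid positions listed for this variant.
print([codon_number(p, 199) for p in (219, 290, 338, 343)])  # [7, 31, 47, 49]
```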

Table 1226 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
136                                    A -> G                      Yes
154                                    T ->                        No
159                                    G -> T                      Yes
219                                    C -> G                      No
274                                    -> G                        No
290                                    A -> G                      No
338                                    A -> T                      Yes
343                                    T -> C                      No
413                                    G -> C                      Yes
707                                    C -> T                      Yes
708                                    C -> A                      Yes
715                                    A -> G                      Yes
730                                    A -> C                      No
730                                    A -> G                      No
746                                    T -> C                      Yes
767                                    C -> T                      No
779                                    G -> A                      Yes
866                                    -> G                        No
869                                    T ->                        No
889                                    -> A                        No
891                                    A -> C                      No
891                                    A -> G                      No
905                                    T -> C                      No
910                                    -> G                        No
910                                    -> T                        No
997                                    A -> G                      No
1026                                   G -> C                      No
1042                                   -> G                        No
1042                                   -> T                        No
1071                                   A ->                        No
1071                                   A -> C                      No
1098                                   A ->                        No
1105                                   C -> T                      No
1124                                   -> G                        No
1135                                   G -> A                      Yes
1136                                   T ->                        No
1136                                   T -> G                      No
1173                                   A -> C                      No
1173                                   A -> G                      No
1179                                   A -> G                      No
1214                                   C -> T                      Yes
1246                                   T ->                        No
1246                                   T -> A                      No
1359                                   A ->                        No
1359                                   A -> G                      No
1362                                   T ->                        No
1365                                   C -> T                      Yes
1366                                   G -> A                      Yes
1408                                   A -> C                      No
1418                                   A -> C                      No
1433                                   A -> C                      No
1456                                   A -> C                      No
1524                                   T -> A                      No
1524                                   T -> C                      No
1547                                   A -> G                      Yes
1553                                   T ->                        No
1574                                   -> G                        No
1654                                   A -> C                      Yes
1691                                   A -> G                      No
1703                                   A -> C                      Yes
1755                                   A -> C                      No
1764                                   T ->                        No



Variant protein HUMOSTRO_PEA_1_PEA_1_P25 according to the present invention has
an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMOSTRO_PEA_1_PEA_1_T16. An alignment is given to the known protein (Osteopontin
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between HUMOSTRO_PEA_1_PEA_1_P25 and OSTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P25,
comprising a first amino acid sequence being at least 90% homologous to
MRIAVICFCLLGITCAIPVKQADSGSSEEKQ corresponding to amino acids 1 - 31 of
OSTP_HUMAN, which also corresponds to amino acids 1 - 31 of
HUMOSTRO_PEA_1_PEA_1_P25, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence H corresponding to
amino acids 32 - 32 of HUMOSTRO_PEA_1_PEA_1_P25, wherein said first amino acid
sequence and second amino acid sequence are contiguous and in a sequential order.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
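
The decision rule stated in the paragraph above can be written out as a brief sketch. The function name, the boolean inputs and the "undetermined" fallback are illustrative assumptions; the application does not specify how disagreeing predictions are resolved, and no real prediction program is invoked here.

```python
def call_localization(signal_peptide_calls: list[bool],
                      transmembrane_calls: list[bool]) -> str:
    """Call a protein 'secreted' only when every signal-peptide predictor
    reports a signal peptide and no trans-membrane predictor reports a
    trans-membrane region; anything else is left undetermined."""
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one prediction of each kind is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"


# Two signal-peptide programs agree, neither trans-membrane program fires.
print(call_localization([True, True], [False, False]))   # secreted
print(call_localization([True, False], [False, False]))  # undetermined
```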

Variant protein HUMOSTRO_PEA_1_PEA_1_P25 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1227 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P25 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 1227 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
7                                         C -> W                       No
31                                        Q -> R                       No



The glycosylation sites of variant protein HUMOSTRO_PEA_1_PEA_1_P25, as
compared to the known protein Osteopontin precursor, are described in Table 1228 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 1228 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?
79                                          no
106                                         no

Variant protein HUMOSTRO_PEA_1_PEA_1_P25 is encoded by the following
transcript(s): HUMOSTRO_PEA_1_PEA_1_T16, for which the sequence(s) is/are given at the
end of the application. The coding portion of transcript HUMOSTRO_PEA_1_PEA_1_T16 is
shown in bold; this coding portion starts at position 199 and ends at position 294. The transcript
also has the following SNPs as listed in Table 1229 (given according to their position on the
nucleotide sequence, with the alternative nucleic acid listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P25 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 1229 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
136                                    A -> G                      Yes
154                                    T ->                        No
159                                    G -> T                      Yes
219                                    C -> G                      No
274                                    -> G                        No
290                                    A -> G                      No
419                                    C -> T                      Yes
454                                    G -> C                      Yes
527                                    A -> T                      Yes
532                                    T -> C                      No
630                                    C -> T                      Yes
631                                    C -> A                      Yes
638                                    A -> G                      Yes
653                                    A -> C                      No
653                                    A -> G                      No
669                                    T -> C                      Yes
690                                    C -> T                      No
702                                    G -> A                      Yes
789                                    -> G                        No
792                                    T ->                        No
812                                    -> A                        No
814                                    A -> C                      No
814                                    A -> G                      No
828                                    T -> C                      No
833                                    -> G                        No
833                                    -> T                        No
920                                    A -> G                      No
949                                    G -> C                      No
965                                    -> G                        No
965                                    -> T                        No
994                                    A ->                        No
994                                    A -> C                      No
1021                                   A ->                        No
1028                                   C -> T                      No
1047                                   -> G                        No
1058                                   G -> A                      Yes
1059                                   T ->                        No
1059                                   T -> G                      No
1096                                   A -> C                      No
1096                                   A -> G                      No
1102                                   A -> G                      No
1137                                   C -> T                      Yes
1169                                   T ->                        No
1169                                   T -> A                      No
1282                                   A ->                        No
1282                                   A -> G                      No
1285                                   T ->                        No
1288                                   C -> T                      Yes
1289                                   G -> A                      Yes
1331                                   A -> C                      No
1341                                   A -> C                      No
1356                                   A -> C                      No
1379                                   A -> C                      No
1447                                   T -> A                      No
1447                                   T -> C                      No
1470                                   A -> G                      Yes
1476                                   T ->                        No
1497                                   -> G                        No
1577                                   A -> C                      Yes
1614                                   A -> G                      No
1626                                   A -> C                      Yes
1678                                   A -> C                      No
1687                                   T ->                        No



Variant protein HUMOSTRO_PEA_1_PEA_1_P30 according to the present invention has
an amino acid sequence as given at the end of the application; it is encoded by transcript(s)
HUMOSTRO_PEA_1_PEA_1_T30. An alignment is given to the known protein (Osteopontin
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between HUMOSTRO_PEA_1_PEA_1_P30 and OSTP_HUMAN:

1. An isolated chimeric polypeptide encoding for HUMOSTRO_PEA_1_PEA_1_P30,
comprising a first amino acid sequence being at least 90% homologous to
MRIAVICFCLLGITCAIPVKQADSGSSEEKQ corresponding to amino acids 1 - 31 of
OSTP_HUMAN, which also corresponds to amino acids 1 - 31 of
HUMOSTRO_PEA_1_PEA_1_P30, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence VSIFYVFI
corresponding to amino acids 32 - 39 of HUMOSTRO_PEA_1_PEA_1_P30, wherein said first
amino acid sequence and second amino acid sequence are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of HUMOSTRO_PEA_1_PEA_1_P30,
comprising a polypeptide being at least 70%, optionally at least about 80%, preferably at least
about 85%, more preferably at least about 90% and most preferably at least about 95%
homologous to the sequence VSIFYVFI in HUMOSTRO_PEA_1_PEA_1_P30.
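The head-and-tail construction described in the comparison report can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and plain position-by-position identity is used as a simple stand-in for "homologous", which the report itself does not define here.

```python
# Sequences quoted in the comparison report above.
HEAD = "MRIAVICFCLLGITCAIPVKQADSGSSEEKQ"  # amino acids 1-31, shared with OSTP_HUMAN
TAIL = "VSIFYVFI"                         # amino acids 32-39, unique to the variant


def assemble_variant(head: str, tail: str) -> str:
    """Join head and tail contiguously and in sequential order."""
    return head + tail


def ungapped_identity(candidate: str, reference: str) -> float:
    """Percent of identical residues between two equal-length sequences.

    Assumption: ungapped identity is used here in place of "homology".
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


variant = assemble_variant(HEAD, TAIL)
print(len(variant))                         # 39
print(variant[31:39])                       # VSIFYVFI (residues 32-39, 1-based)
print(ungapped_identity("VSIFYVFL", TAIL))  # 87.5 (hypothetical tail, one substitution)
```

Under this simplified measure, a hypothetical tail with one substitution out of eight residues would satisfy the 70%, 80% and 85% thresholds but not the 90% or 95% thresholds.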

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
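The decision rule stated in this paragraph can be written as a small sketch. The function name and the boolean inputs are hypothetical; the actual prediction programs and their output formats are not specified in this passage, and the "undetermined" fallback is an assumption for cases the text does not cover.

```python
from typing import Sequence


def predict_localization(signal_peptide_calls: Sequence[bool],
                         transmembrane_calls: Sequence[bool]) -> str:
    """Return "secreted" when every signal-peptide predictor reports a signal
    peptide and no trans-membrane predictor reports a trans-membrane region."""
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one call per predictor type is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"  # assumption: other combinations are not resolved here


# Two signal-peptide predictors agree, two trans-membrane predictors find nothing.
print(predict_localization([True, True], [False, False]))
```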
Variant protein HUMOSTRO_PEA_1_PEA_1_P30 also has the following non-silent
SNPs (Single Nucleotide Polymorphisms) as listed in Table 1230 (given according to their
position(s) on the amino acid sequence, with the alternative amino acid(s) listed; the last column
indicates whether the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P30 sequence provides support for the deduced sequence of this
variant protein according to the present invention).

Table 1230 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
7 | C->W | No
31 | Q->R | No



The glycosylation sites of variant protein HUMOSTRO_PEA_1_PEA_1_P30, as
compared to the known protein Osteopontin precursor, are described in Table 1231 (given
according to their position(s) on the amino acid sequence in the first column; the second column
indicates whether the glycosylation site is present in the variant protein; and the last column
indicates whether the position is different on the variant protein).

Table 1231 - Glycosylation site(s)

Position(s) on known amino acid sequence | Present in variant protein?
79 | no
106 | no



Variant protein HUMOSTRO_PEA_1_PEA_1_P30 is encoded by the following
transcript(s): HUMOSTRO_PEA_1_PEA_1_T30, for which the sequence(s) is/are given at the
end of the application. The coding portion of transcript HUMOSTRO_PEA_1_PEA_1_T30 is
shown in bold; this coding portion starts at position 199 and ends at position 315. The transcript
also has the following SNPs as listed in Table 1232 (given according to their position on the
nucleotide sequence, with the alternative nucleic acid listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein
HUMOSTRO_PEA_1_PEA_1_P30 sequence provides support for the deduced sequence of this
variant protein according to the present invention).
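As a quick arithmetic check on the coding portion quoted above, the sketch below computes its length in nucleotides and codons. It assumes that positions are 1-based and inclusive and that the stated span does not include a stop codon; both are assumptions, chosen because they agree with the 39-residue variant protein described in the comparison report. The function name is hypothetical.

```python
from typing import Tuple


def coding_span(start: int, end: int) -> Tuple[int, int]:
    """Return (length in nucleotides, number of codons) for a 1-based inclusive span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding span is not a whole number of codons")
    return length, length // 3


print(coding_span(199, 315))  # (117, 39)
```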

Table 1232 - Nucleic acid SNPs

SNP position on nucleotide sequence | Alternative nucleic acid | Previously known SNP?
136 | A->G | Yes
154 | T-> | No
159 | G->T | Yes
219 | C->G | No
274 | ->G | No
290 | A->G | No
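The nucleotide positions in Table 1232 can be related to amino acid positions by simple codon arithmetic. The sketch below is an illustration only: the helper name is hypothetical, and it assumes 1-based inclusive coordinates with the coding portion running from position 199 to position 315 as stated above.

```python
from typing import Optional, Tuple

CODING_START = 199  # first coding nucleotide of the T30 transcript, as stated in the text
CODING_END = 315    # last coding nucleotide, as stated in the text


def snp_to_codon(position: int,
                 coding_start: int = CODING_START,
                 coding_end: int = CODING_END) -> Optional[Tuple[int, int]]:
    """Map a transcript position to (codon number, position within codon), both 1-based.

    Returns None when the position lies outside the coding portion.
    """
    if not coding_start <= position <= coding_end:
        return None
    offset = position - coding_start
    return offset // 3 + 1, offset % 3 + 1


for pos in (136, 219, 274, 290):
    print(pos, snp_to_codon(pos))
# 136 None
# 219 (7, 3)
# 274 (26, 1)
# 290 (31, 2)
```

Under these assumptions, positions 219 and 290 fall in codons 7 and 31, which are the two amino acid positions listed in Table 1230, while position 136 lies upstream of the coding portion.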



As noted above, cluster HUMOSTRO features 30 segment(s), which were listed in Table
2 above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_0 according to the present
invention is supported by 333 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14, HUMOSTRO_PEA_1_PEA_1_T16 and
HUMOSTRO_PEA_1_PEA_1_T30. Table 1233 below describes the starting and ending
position of this segment on each transcript.

Table 1234 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1 | 184
HUMOSTRO_PEA_1_PEA_1_T16 | 1 | 184
HUMOSTRO_PEA_1_PEA_1_T30 | 1 | 184



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_10 according to the present
invention is supported by 4 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T16. Table 1235 below describes the starting and ending
position of this segment on each transcript.

Table 1235 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T16 | 292 | 480



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_16 according to the present
invention is supported by 6 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14. Table 1236 below describes the starting and ending
position of this segment on each transcript.

Table 1236 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 373 | 638



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_23 according to the present
invention is supported by 334 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1237 below
describes the starting and ending position of this segment on each transcript.

Table 1237 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 804 | 967
HUMOSTRO_PEA_1_PEA_1_T16 | 727 | 890



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_31 according to the present
invention is supported by 350 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1238 below
describes the starting and ending position of this segment on each transcript.

Table 1238 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1164 | 1393
HUMOSTRO_PEA_1_PEA_1_T16 | 1087 | 1316



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_43 according to the present
invention is supported by 192 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1239 below
describes the starting and ending position of this segment on each transcript.

Table 1239 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1810 | 1846
HUMOSTRO_PEA_1_PEA_1_T16 | 1733 | 1769



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_3 according to the present
invention is supported by 353 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14, HUMOSTRO_PEA_1_PEA_1_T16 and
HUMOSTRO_PEA_1_PEA_1_T30. Table 1240 below describes the starting and ending
position of this segment on each transcript.

Table 1240 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 185 | 210
HUMOSTRO_PEA_1_PEA_1_T16 | 185 | 210
HUMOSTRO_PEA_1_PEA_1_T30 | 185 | 210



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_5 according to the present
invention is supported by 353 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14, HUMOSTRO_PEA_1_PEA_1_T16 and
HUMOSTRO_PEA_1_PEA_1_T30. Table 1241 below describes the starting and ending
position of this segment on each transcript.

Table 1241 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 211 | 252
HUMOSTRO_PEA_1_PEA_1_T16 | 211 | 252
HUMOSTRO_PEA_1_PEA_1_T30 | 211 | 252



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_7 according to the present
invention is supported by 357 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14, HUMOSTRO_PEA_1_PEA_1_T16 and
HUMOSTRO_PEA_1_PEA_1_T30. Table 1242 below describes the starting and ending
position of this segment on each transcript.

Table 1242 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 253 | 291
HUMOSTRO_PEA_1_PEA_1_T16 | 253 | 291
HUMOSTRO_PEA_1_PEA_1_T30 | 253 | 291

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_8 according to the present
invention is supported by 1 library. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T30. Table 1243 below describes the starting and ending
position of this segment on each transcript.

Table 1243 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T30 | 292 | 378
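The segment coordinates listed for transcript HUMOSTRO_PEA_1_PEA_1_T30 in this section (nodes 0, 3, 5, 7 and 8) can be checked for gaps or overlaps with a short sketch. The data structure and function names are hypothetical, and the sketch assumes that the positions are 1-based and inclusive and that these five nodes are the only segments of this transcript.

```python
from typing import List, Optional, Tuple

# (segment name, start, end) on transcript T30, copied from the segment location tables.
T30_SEGMENTS: List[Tuple[str, int, int]] = [
    ("node_0", 1, 184),
    ("node_3", 185, 210),
    ("node_5", 211, 252),
    ("node_7", 253, 291),
    ("node_8", 292, 378),
]


def is_contiguous(segments: List[Tuple[str, int, int]]) -> bool:
    """True when the segments tile the transcript from position 1 with no gaps or overlaps."""
    expected_start = 1
    for _name, start, end in sorted(segments, key=lambda s: s[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return True


def segment_at(position: int, segments: List[Tuple[str, int, int]]) -> Optional[str]:
    """Return the name of the segment covering a transcript position, if any."""
    for name, start, end in segments:
        if start <= position <= end:
            return name
    return None


print(is_contiguous(T30_SEGMENTS))    # True
print(segment_at(199, T30_SEGMENTS))  # node_3 (position where the coding portion starts)
print(segment_at(315, T30_SEGMENTS))  # node_8 (position where the coding portion ends)
```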



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_15 according to the present
invention is supported by 366 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1244 below
describes the starting and ending position of this segment on each transcript.

Table 1244 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 292 | 372
HUMOSTRO_PEA_1_PEA_1_T16 | 481 | 561

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_17 according to the present
invention is supported by 261 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1245 below
describes the starting and ending position of this segment on each transcript.

Table 1245 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 639 | 680
HUMOSTRO_PEA_1_PEA_1_T16 | 562 | 603



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_20 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1246 below describes the starting and ending
position of this segment on each transcript.

Table 1246 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 681 | 688
HUMOSTRO_PEA_1_PEA_1_T16 | 604 | 611



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_21 according to the present
invention is supported by 315 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1247 below
describes the starting and ending position of this segment on each transcript.

Table 1247 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 689 | 738
HUMOSTRO_PEA_1_PEA_1_T16 | 612 | 661




Segment cluster HUMOSTRO_PEA_1_PEA_1_node_22 according to the present
invention is supported by 322 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1248 below
describes the starting and ending position of this segment on each transcript.

Table 1248 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 739 | 803
HUMOSTRO_PEA_1_PEA_1_T16 | 662 | 726



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_24 according to the present
invention is supported by 270 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1249 below
describes the starting and ending position of this segment on each transcript.

Table 1249 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 968 | 1004
HUMOSTRO_PEA_1_PEA_1_T16 | 891 | 927

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_26 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1250 below describes the starting and ending
position of this segment on each transcript.

Table 1250 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1005 | 1022
HUMOSTRO_PEA_1_PEA_1_T16 | 928 | 945



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_27 according to the present
invention is supported by 260 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1251 below
describes the starting and ending position of this segment on each transcript.

Table 1251 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1023 | 1048
HUMOSTRO_PEA_1_PEA_1_T16 | 946 | 971

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_28 according to the present
invention is supported by 273 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1252 below
describes the starting and ending position of this segment on each transcript.

Table 1252 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1049 | 1100
HUMOSTRO_PEA_1_PEA_1_T16 | 972 | 1023

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_29 according to the present
invention is supported by 272 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1253 below
describes the starting and ending position of this segment on each transcript.

Table 1253 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1101 | 1151
HUMOSTRO_PEA_1_PEA_1_T16 | 1024 | 1074



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_30 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1254 below describes the starting and ending
position of this segment on each transcript.

Table 1254 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1152 | 1163
HUMOSTRO_PEA_1_PEA_1_T16 | 1075 | 1086



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_32 according to the present
invention is supported by 293 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1255 below
describes the starting and ending position of this segment on each transcript.

Table 1255 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1394 | 1427
HUMOSTRO_PEA_1_PEA_1_T16 | 1317 | 1350



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_34 according to the present
invention is supported by 301 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1256 below
describes the starting and ending position of this segment on each transcript.

Table 1256 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1428 | 1468
HUMOSTRO_PEA_1_PEA_1_T16 | 1351 | 1391

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_36 according to the present
invention is supported by 292 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1257 below
describes the starting and ending position of this segment on each transcript.

Table 1257 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1469 | 1504
HUMOSTRO_PEA_1_PEA_1_T16 | 1392 | 1427

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_37 according to the present
invention is supported by 295 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1258 below
describes the starting and ending position of this segment on each transcript.

Table 1258 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1505 | 1623
HUMOSTRO_PEA_1_PEA_1_T16 | 1428 | 1546



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_38 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1259 below describes the starting and ending
position of this segment on each transcript.

Table 1259 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1624 | 1634
HUMOSTRO_PEA_1_PEA_1_T16 | 1547 | 1557



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_39 according to the present
invention is supported by 268 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1260 below
describes the starting and ending position of this segment on each transcript.

Table 1260 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1635 | 1725
HUMOSTRO_PEA_1_PEA_1_T16 | 1558 | 1648



Segment cluster HUMOSTRO_PEA_1_PEA_1_node_40 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1261 below describes the starting and ending
position of this segment on each transcript.

Table 1261 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1726 | 1743
HUMOSTRO_PEA_1_PEA_1_T16 | 1649 | 1666

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_41 according to the present
invention can be found in the following transcript(s): HUMOSTRO_PEA_1_PEA_1_T14 and
HUMOSTRO_PEA_1_PEA_1_T16. Table 1262 below describes the starting and ending
position of this segment on each transcript.

Table 1262 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1744 | 1749
HUMOSTRO_PEA_1_PEA_1_T16 | 1667 | 1672

Segment cluster HUMOSTRO_PEA_1_PEA_1_node_42 according to the present
invention is supported by 224 libraries. The number of libraries was determined as previously
described. This segment can be found in the following transcript(s):
HUMOSTRO_PEA_1_PEA_1_T14 and HUMOSTRO_PEA_1_PEA_1_T16. Table 1263 below
describes the starting and ending position of this segment on each transcript.

Table 1263 - Segment location on transcripts

Transcript name | Segment starting position | Segment ending position
HUMOSTRO_PEA_1_PEA_1_T14 | 1750 | 1809
HUMOSTRO_PEA_1_PEA_1_T16 | 1673 | 1732



Variant protein alignment to the previously known protein:

Sequence name: OSTP_HUMAN
Sequence documentation:
Alignment of: HUMOSTRO_PEA_1_PEA_1_P21 x OSTP_HUMAN
Alignment segment 1/1:

Quality: 578.00
Escore: 0
Matching length: 58
Total length: 58
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQ 50
   ||||||||||||||||||||||||||||||||||||||||||||||||||
 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQ 50

51 KQNLLAPQ 58
   ||||||||
51 KQNLLAPQ 58
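The layout of the alignment report above (two sequence rows with a row of match bars between them, wrapped at 50 residues) can be reproduced for an ungapped alignment with the following sketch. The function name and the block width parameter are hypothetical, and the sketch does not reproduce the Quality or Escore values, whose scoring scheme is not described in this passage.

```python
def format_ungapped_alignment(query: str, subject: str, width: int = 50) -> str:
    """Render two equal-length sequences as wrapped blocks with a '|' under each match."""
    if len(query) != len(subject):
        raise ValueError("an ungapped alignment needs sequences of equal length")
    blocks = []
    for start in range(0, len(query), width):
        q = query[start:start + width]
        s = subject[start:start + width]
        match = "".join("|" if a == b else " " for a, b in zip(q, s))
        label = f"{start + 1:>3} "           # 1-based position of the first residue in the block
        end = start + len(q)                  # 1-based position of the last residue in the block
        blocks.append(f"{label}{q} {end}\n{' ' * len(label)}{match}\n{label}{s} {end}")
    return "\n\n".join(blocks)


# The 58 aligned residues shown in the report above.
P21_ALIGNED = "MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQ"

report = format_ungapped_alignment(P21_ALIGNED, P21_ALIGNED)
print(report)
print(report.count("|"))  # number of matching positions
```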



Sequence name: OSTP_HUMAN
Sequence documentation:
Alignment of: HUMOSTRO_PEA_1_PEA_1_P25 x OSTP_HUMAN
Alignment segment 1/1:

Quality: 301.00
Escore: 0
Matching length: 31
Total length: 31
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQ 31
   |||||||||||||||||||||||||||||||
 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQ 31



Sequence name: OSTP_HUMAN
Sequence documentation:
Alignment of: HUMOSTRO_PEA_1_PEA_1_P30 x OSTP_HUMAN
Alignment segment 1/1:

Quality: 301.00
Escore: 0
Matching length: 31
Total length: 31
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQ 31
   |||||||||||||||||||||||||||||||
 1 MRIAVICFCLLGITCAIPVKQADSGSSEEKQ 31

DESCRIPTION FOR CLUSTER R11723

Cluster R11723 features 6 transcript(s) and 26 segment(s) of interest, the names for which
are given in Tables 1264 and 1265, respectively, the sequences themselves are given at the end
of the application. The selected protein variants are given in table 1266.

Table 1264 - Transcripts of interest

Transcript Name | Sequence ID No.
R11723_PEA_1_T15 | 144
R11723_PEA_1_T17 | 145
R11723_PEA_1_T19 | 146
R11723_PEA_1_T20 | 147
R11723_PEA_1_T5 | 148
R11723_PEA_1_T6 | 149



Table 1265 - Segments of interest

Segment Name | Sequence ID No.
R11723_PEA_1_node_13 | 991
R11723_PEA_1_node_16 | 992
R11723_PEA_1_node_19 | 993
R11723_PEA_1_node_2 | 994
R11723_PEA_1_node_22 | 995
R11723_PEA_1_node_31 | 996
R11723_PEA_1_node_10 | 997
R11723_PEA_1_node_11 | 998
R11723_PEA_1_node_15 | 999
R11723_PEA_1_node_18 | 1000
R11723_PEA_1_node_20 | 1001
R11723_PEA_1_node_21 | 1002
R11723_PEA_1_node_23 | 1003
R11723_PEA_1_node_24 | 1004
R11723_PEA_1_node_25 | 1005
R11723_PEA_1_node_26 | 1006
R11723_PEA_1_node_27 | 1007
R11723_PEA_1_node_28 | 1008
R11723_PEA_1_node_29 | 1009
R11723_PEA_1_node_3 | 1010
R11723_PEA_1_node_30 | 1011
R11723_PEA_1_node_4 | 1012
R11723_PEA_1_node_5 | 1013
R11723_PEA_1_node_6 | 1014
R11723_PEA_1_node_7 | 1015
R11723_PEA_1_node_8 | 1016



Table 1266 - Proteins of interest

Protein Name | Sequence ID No.
R11723_PEA_1_P2 | 1409
R11723_PEA_1_P6 | 1410
R11723_PEA_1_P7 | 1411
R11723_PEA_1_P13 | 1412
R11723_PEA_1_P10 | 1413



Cluster R11723 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of figure 47 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).
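The "parts per million" ratio defined above can be sketched as follows. This is a minimal illustration: the function name and the example counts are hypothetical, and any weighting applied by the previously described methods (for example, per-library normalization) is not reproduced here.

```python
def expression_ppm(cluster_est_count: float, category_est_count: float) -> float:
    """Ratio of ESTs assigned to one cluster to all ESTs in a tissue category, in ppm."""
    if category_est_count <= 0:
        raise ValueError("the category must contain at least one EST")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical counts: 3 cluster ESTs among 100,000 ESTs in a tissue category.
print(expression_ppm(3, 100_000))  # 30.0
```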

Overall, the following results were obtained as shown with regard to the histograms in
Figure 47 and Table 1267. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors
from different tissues and kidney malignant tumors.

Table 1267 - Normal tissue distribution

Name of Tissue | Number
Adrenal | 0
Brain | 30
Epithelial | 3
General | 17
head and neck | 0
Kidney | 0
Lung | 0
Breast | 0
Ovary | 0
Pancreas | 10
Skin | 0
Uterus | 0



Table 1268 - P values and ratios for expression in cancerous tissue

Name of Tissue | P1 | P2 | SP1 | R3 | SP2 | R4
Adrenal | 4.2e-01 | 4.6e-01 | 4.6e-01 | 2.2 | 5.3e-01 | 1.9
Brain | 2.2e-01 | 2.0e-01 | 1.2e-02 | 2.8 | 5.0e-02 | 2.0
Epithelial | 3.0e-05 | 6.3e-05 | 1.8e-05 | 6.3 | 3.4e-06 | 6.4
General | 7.2e-03 | 4.0e-02 | 1.3e-04 | 2.1 | 1.1e-03 | 1.7
head and neck | 1 | 5.0e-01 | 1 | 1.0 | 7.5e-01 | 1.3
Kidney | 1.5e-01 | 2.4e-01 | 4.4e-03 | 5.4 | 2.8e-02 | 3.6
Lung | 1.2e-01 | 1.6e-01 | 1 | 1.6 | 1 | 1.3
Breast | 5.9e-01 | 4.4e-01 | 1 | 1.1 | 6.8e-01 | 1.5
Ovary | 1.6e-02 | 1.3e-02 | 1.0e-01 | 3.8 | 7.0e-02 | 3.5
Pancreas | 5.5e-01 | 2.0e-01 | 3.9e-01 | 1.9 | 1.4e-01 | 2.7
Skin | 1 | 4.4e-01 | 1 | 1.0 | 1.9e-02 | 2.1
Uterus | 1.5e-02 | 5.4e-02 | 1.9e-01 | 3.1 | 1.4e-01 | 2.5

As noted above, contig R11723 features 6 transcript(s), which were listed in Table 1
above. A description of each variant protein according to the present invention is now provided.

Variant protein R11723_PEA_1_P2 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
R11723_PEA_1_T6. The location of the variant protein was determined according to results
from a number of different software programs and analyses, including analyses from SignalP
and other specialized programs. The variant protein is believed to be located as follows with
regard to the cell: secreted. The protein localization is believed to be secreted because both
signal-peptide prediction programs predict that this protein has a signal peptide, and neither
trans-membrane region prediction program predicts that this protein has a trans-membrane
region.

Variant protein R11723_PEA_1_P2 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1269 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R11723_PEA_1_P2
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1269 - Amino acid mutations

SNP position(s) on amino acid sequence | Alternative amino acid(s) | Previously known SNP?
107 | H->P | Yes
70 | G-> | No
70 | G->C | No



Variant protein R11723_PEA_1_P2 is encoded by the following transcript(s):
R11723_PEA_1_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R11723_PEA_1_T6 is shown in bold; this coding portion starts at
position 1716 and ends at position 2051. The transcript also has the following SNPs as listed in
Table 1270 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R11723_PEA_1_P2 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1270 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
1231                                   C->T                        Yes
1278                                   G->C                        Yes
1923                                   G->                         No
1923                                   G->T                        No
2035                                   A->C                        Yes
2048                                   A->C                        No
2057                                   A->G                        Yes
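The amino acid positions in Table 1269 can be cross-checked against the nucleotide positions in Table 1270 using the stated coding start (position 1716). The following is a minimal sketch only, assuming 1-based inclusive positions and a reading frame that begins at the first nucleotide of the coding portion; the function name `codon_index` is hypothetical and is not part of the application.

```python
def codon_index(nt_pos, cds_start, cds_end):
    """Return the 1-based codon number that contains nt_pos.

    Returns None when nt_pos lies outside the coding portion.
    Assumption: coordinates are 1-based and inclusive, and the
    reading frame starts at cds_start.
    """
    if nt_pos < cds_start or nt_pos > cds_end:
        return None
    return (nt_pos - cds_start) // 3 + 1


# Coding portion of transcript R11723_PEA_1_T6 as stated in the text.
CDS_START, CDS_END = 1716, 2051

# Nucleotide SNP positions listed in Table 1270.
for pos in (1231, 1278, 1923, 2035, 2048, 2057):
    print(pos, codon_index(pos, CDS_START, CDS_END))
```

Under these assumptions, nucleotide positions 1923 and 2035 fall in codons 70 and 107, which are the two amino acid positions listed in Table 1269, while positions 1231, 1278 and 2057 lie outside the coding portion.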



Variant protein R11723_PEA_1_P6 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
R11723_PEA_1_T15. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R11723_PEA_1_P6 and Q8IXM0 (SEQ ID NO:1707):
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAGSPCRGLAPGREEQRALHKAGAVGGGVR
corresponding to amino acids 1-110 of R11723_PEA_1_P6, and a second amino acid sequence
being at least 90% homologous to
MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLREGEEDHV
RPEVGPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRERQRKEKHSM
corresponding to amino acids 1-112 of Q8IXM0, which also corresponds to amino acids 111-222
of R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for a head of R11723_PEA_1_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAGSPCRGLAPGREEQRALHKAGAVGGGVR of
R11723_PEA_1_P6.

Comparison report between R11723_PEA_1_P6 and Q96AC2 (SEQ ID NO: 1708):
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 1-83 of Q96AC2,
which also corresponds to amino acids 1-83 of R11723_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84-222 of
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.

Comparison report between R11723_PEA_1_P6 and Q8N2G4 (SEQ ID NO: 1709):
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 1-83 of Q8N2G4,
which also corresponds to amino acids 1-83 of R11723_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84-222 of
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.

Comparison report between R11723_PEA_1_P6 and BAC85518 (SEQ ID NO: 1710):
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P6, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAGIMYRKSCASSAACLIASAG corresponding to amino acids 24-106 of BAC85518,
which also corresponds to amino acids 1-83 of R11723_PEA_1_P6, and a second amino acid
sequence being at least 70%, optionally at least 80%, preferably at least 85%, more preferably at
least 90% and most preferably at least 95% homologous to a polypeptide having the sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ corresponding to amino acids 84-222 of
R11723_PEA_1_P6, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P6, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
SPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLL
RGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQ
CHNNQPWADTSRRERQRKEKHSMRTQ in R11723_PEA_1_P6.
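The percentage thresholds recited in the comparison reports can be illustrated with a simple position-by-position comparison. The sketch below is an assumption made for illustration: it computes ungapped percent identity over two pre-aligned sequences of equal length, which is not necessarily the homology measure defined in the application (that would normally involve an alignment program). The function names and the candidate sequence are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold_percent):
    """True when the percent identity reaches the given threshold."""
    return percent_identity(seq_a, seq_b) >= threshold_percent


# First 27 residues of the tail sequence recited above.
REFERENCE = "SPCRGLAPGREEQRALHKAGAVGGGVR"
# Hypothetical candidate: identical except that the last residue is K instead of R.
CANDIDATE = "SPCRGLAPGREEQRALHKAGAVGGGVK"

print(round(percent_identity(REFERENCE, CANDIDATE), 1))
print(meets_threshold(REFERENCE, CANDIDATE, 95.0))
```

With 26 of 27 positions matching, the hypothetical candidate scores about 96.3% under this simplified measure, so it would satisfy an "at least 95%" threshold.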

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.
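The reasoning in the preceding paragraph amounts to a simple decision rule over the predictor outputs. The following sketch is illustrative only: the function name, the boolean inputs and the "undetermined" fallback are assumptions, and the sketch does not call SignalP or any other real prediction program.

```python
def predict_localization(signal_peptide_calls, transmembrane_calls):
    """Apply the rule described in the text to lists of boolean predictor calls.

    Returns "secreted" when every signal-peptide predictor reports a signal
    peptide and no trans-membrane predictor reports a trans-membrane region.
    Any other combination is reported as "undetermined" (a placeholder used
    only in this sketch).
    """
    if not signal_peptide_calls or not transmembrane_calls:
        raise ValueError("at least one call of each kind is required")
    if all(signal_peptide_calls) and not any(transmembrane_calls):
        return "secreted"
    return "undetermined"


# Two signal-peptide predictors agree; neither trans-membrane predictor fires.
print(predict_localization([True, True], [False, False]))
```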

Variant protein R11723_PEA_1_P6 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1271 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R11723_PEA_1_P6
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1271 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
180                                       G->                          No
180                                       G->C                         No
217                                       H->P                         Yes

Variant protein R11723_PEA_1_P6 is encoded by the following transcript(s):
R11723_PEA_1_T15, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R11723_PEA_1_T15 is shown in bold; this coding portion starts at
position 434 and ends at position 1099. The transcript also has the following SNPs as listed in
Table 1272 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R11723_PEA_1_P6 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1272 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
971                                    G->                         No
971                                    G->T                        No
1083                                   A->C                        Yes
1096                                   A->C                        No
1105                                   A->G                        Yes
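The stated coding coordinates can be checked for internal consistency with the amino acid ranges cited in the comparison reports. The sketch below assumes 1-based inclusive coordinates; the function name is hypothetical and the check is an illustration, not a procedure described in the application.

```python
def coding_length_in_codons(cds_start, cds_end):
    """Number of complete codons in a coding portion given inclusive coordinates."""
    length_nt = cds_end - cds_start + 1
    if length_nt <= 0:
        raise ValueError("cds_end must not precede cds_start")
    if length_nt % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return length_nt // 3


# Coding portion of transcript R11723_PEA_1_T15 as stated in the text.
print(coding_length_in_codons(434, 1099))
```

For positions 434 to 1099 this gives 222 codons, which agrees with the range "amino acids 84-222" cited for R11723_PEA_1_P6. That agreement suggests the stated coding portion does not include a stop codon, although this is an inference from the numbers rather than a statement in the application.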



Variant protein R11723_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
R11723_PEA_1_T17. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R11723_PEA_1_P7 and Q96AC2:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAG corresponding to amino acids 1-64 of Q96AC2, which also corresponds to amino
acids 1-64 of R11723_PEA_1_P7, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65-93 of
R11723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

Comparison report between R11723_PEA_1_P7 and Q8N2G4:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAG corresponding to amino acids 1-64 of Q8N2G4, which also corresponds to amino
acids 1-64 of R11723_PEA_1_P7, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65-93 of
R11723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.
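The requirement that the first and second amino acid sequences be "contiguous and in a sequential order" can be checked mechanically against the recited ranges. The sketch below is a minimal illustration that assumes 1-based inclusive amino acid ranges; the function name `assemble_chimera` is hypothetical and not part of the application.

```python
def assemble_chimera(parts):
    """Concatenate (sequence, start, end) parts after validating their ranges.

    Each range is 1-based and inclusive. The parts must start at position 1,
    follow one another without gaps or overlaps, and each sequence length
    must equal end - start + 1.
    """
    expected_start = 1
    pieces = []
    for sequence, start, end in parts:
        if start != expected_start:
            raise ValueError(f"range {start}-{end} is not contiguous")
        if len(sequence) != end - start + 1:
            raise ValueError(f"range {start}-{end} does not match sequence length")
        pieces.append(sequence)
        expected_start = end + 1
    return "".join(pieces)


# Sequences and ranges as recited for R11723_PEA_1_P7 above.
FIRST = ("MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV"
         "MEQSAG")
SECOND = "SHCVTRLECSGTISAHCNLCLPGSNDHPT"

P7 = assemble_chimera([(FIRST, 1, 64), (SECOND, 65, 93)])
print(len(P7))
```

The recited first sequence has 64 residues and the tail has 29, so the assembled polypeptide has 93 residues, matching the range "amino acids 65-93" given for the tail.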

Comparison report between R11723_PEA_1_P7 and BAC85273:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MWVLG corresponding to amino acids 1-5 of R11723_PEA_1_P7, a second
amino acid sequence being at least 90% homologous to
IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSAG
corresponding to amino acids 22-80 of BAC85273, which also corresponds to amino acids 6-64
of R11723_PEA_1_P7, and a third amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65-93 of
R11723_PEA_1_P7, wherein said first, second and third amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for a head of R11723_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MWVLG of R11723_PEA_1_P7.

3. An isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

Comparison report between R11723_PEA_1_P7 and BAC85518:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P7, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSAG corresponding to amino acids 24-87 of BAC85518, which also corresponds to
amino acids 1-64 of R11723_PEA_1_P7, and a second amino acid sequence being at least
70%, optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
SHCVTRLECSGTISAHCNLCLPGSNDHPT corresponding to amino acids 65-93 of
R11723_PEA_1_P7, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence SHCVTRLECSGTISAHCNLCLPGSNDHPT in R11723_PEA_1_P7.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R11723_PEA_1_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1273 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R11723_PEA_1_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1273 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
67                                        C->S                         Yes



Variant protein R11723_PEA_1_P7 is encoded by the following transcript(s):
R11723_PEA_1_T17, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R11723_PEA_1_T17 is shown in bold; this coding portion starts at
position 434 and ends at position 712. The transcript also has the following SNPs as listed in
Table 1274 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R11723_PEA_1_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1274 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
625                                    G->T                        Yes
633                                    G->C                        Yes
1303                                   C->T                        Yes



Variant protein R11723_PEA_1_P13 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R11723_PEA_1_T19. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R11723_PEA_1_P13 and Q96AC2:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P13, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 1-63 of Q96AC2, which also corresponds to amino
acids 1-63 of R11723_PEA_1_P13, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DTKRTNTLLFEMRHFAKQLTT corresponding to amino acids 64-84 of
R11723_PEA_1_P13, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P13, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DTKRTNTLLFEMRHFAKQLTT in R11723_PEA_1_P13.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R11723_PEA_1_P13 is encoded by the following transcript(s):
R11723_PEA_1_T19 and R11723_PEA_1_T5, for which the sequence(s) is/are given at the end
of the application. The coding portion of transcript R11723_PEA_1_T19 is shown in bold; this
coding portion starts at position 434 and ends at position 685. The transcript also has the
following SNPs as listed in Table 1275 (given according to their position on the nucleotide
sequence, with the alternative nucleic acid listed; the last column indicates whether the SNP is
known or not; the presence of known SNPs in variant protein R11723_PEA_1_P13 sequence
provides support for the deduced sequence of this variant protein according to the present
invention).

Table 1275 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
778                                    G->T                        Yes
786                                    G->C                        Yes
1456                                   C->T                        Yes



Variant protein R11723_PEA_1_P10 according to the present invention has an amino
acid sequence as given at the end of the application; it is encoded by transcript(s)
R11723_PEA_1_T20. One or more alignments to one or more previously published protein
sequences are given at the end of the application. A brief description of the relationship of the
variant protein according to the present invention to each such aligned protein is as follows:

Comparison report between R11723_PEA_1_P10 and Q96AC2:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 1-63 of Q96AC2, which also corresponds to amino
acids 1-63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64-90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

Comparison report between R11723_PEA_1_P10 and Q8N2G4:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 1-63 of Q8N2G4, which also corresponds to amino
acids 1-63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64-90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

Comparison report between R11723_PEA_1_P10 and BAC85273:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first
amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%, more
preferably at least 90% and most preferably at least 95% homologous to a polypeptide having
the sequence MWVLG corresponding to amino acids 1-5 of R11723_PEA_1_P10, a second
amino acid sequence being at least 90% homologous to
IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSA
corresponding to amino acids 22-79 of BAC85273, which also corresponds to amino acids 6-63
of R11723_PEA_1_P10, and a third amino acid sequence being at least 70%, optionally at
least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least
95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64-90 of
R11723_PEA_1_P10, wherein said first, second and third amino acid sequences are contiguous
and in a sequential order.

2. An isolated polypeptide encoding for a head of R11723_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence MWVLG of R11723_PEA_1_P10.

3. An isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

Comparison report between R11723_PEA_1_P10 and BAC85518:
1. An isolated chimeric polypeptide encoding for R11723_PEA_1_P10, comprising a first
amino acid sequence being at least 90% homologous to
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEV
MEQSA corresponding to amino acids 24-86 of BAC85518, which also corresponds to amino
acids 1-63 of R11723_PEA_1_P10, and a second amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence
DRVSLCHEAGVQWNNFSTLQPLPPRLK corresponding to amino acids 64-90 of
R11723_PEA_1_P10, wherein said first and second amino acid sequences are contiguous and in
a sequential order.

2. An isolated polypeptide encoding for a tail of R11723_PEA_1_P10, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence DRVSLCHEAGVQWNNFSTLQPLPPRLK in R11723_PEA_1_P10.

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R11723_PEA_1_P10 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1276 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R11723_PEA_1_P10
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1276 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
66                                        V->F                         Yes



Variant protein R11723_PEA_1_P10 is encoded by the following transcript(s):
R11723_PEA_1_T20, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R11723_PEA_1_T20 is shown in bold; this coding portion starts at
position 434 and ends at position 703. The transcript also has the following SNPs as listed in
Table 1277 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R11723_PEA_1_P10 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 1277 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
629                                    G->T                        Yes
637                                    G->C                        Yes
1307                                   C->T                        Yes


As noted above, cluster R11723 features 26 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.



Segment cluster R11723_PEA_1_node_13 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T19, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1278 below describes the starting and ending position of this
segment on each transcript.

Table 1278 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T19    624                          776
R11723_PEA_1_T5     624                          776
R11723_PEA_1_T6     658                          810
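Because the same segment appears on several transcripts, its length should be identical on each of them even when its coordinates are shifted. The sketch below illustrates that check with the values of Table 1278, assuming 1-based inclusive positions; the dictionary layout and function names are hypothetical.

```python
# Start and end positions of segment R11723_PEA_1_node_13, from Table 1278.
NODE_13 = {
    "R11723_PEA_1_T19": (624, 776),
    "R11723_PEA_1_T5": (624, 776),
    "R11723_PEA_1_T6": (658, 810),
}


def segment_lengths(locations):
    """Length of the segment on each transcript (inclusive coordinates)."""
    return {name: end - start + 1 for name, (start, end) in locations.items()}


def start_offsets(locations, reference):
    """Shift of each transcript's segment start relative to a reference transcript."""
    ref_start = locations[reference][0]
    return {name: start - ref_start for name, (start, _end) in locations.items()}


print(segment_lengths(NODE_13))
print(start_offsets(NODE_13, "R11723_PEA_1_T19"))
```

With these values the segment is 153 nucleotides long on all three transcripts, and its position on R11723_PEA_1_T6 is shifted by 34 nucleotides relative to R11723_PEA_1_T19.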



Segment cluster R11723_PEA_1_node_16 according to the present invention is supported
by 3 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T17, R11723_PEA_1_T19 and
R11723_PEA_1_T20. Table 1279 below describes the starting and ending position of this
segment on each transcript.

Table 1279 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T17    624                          1367
R11723_PEA_1_T19    777                          1520
R11723_PEA_1_T20    628                          1371



Segment cluster R11723_PEA_1_node_19 according to the present invention is supported
by 45 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T5 and R11723_PEA_1_T6. Table
1280 below describes the starting and ending position of this segment on each transcript.

Table 1280 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T5     835                          1008
R11723_PEA_1_T6     869                          1042



Segment cluster R11723_PEA_1_node_2 according to the present invention is supported
by 29 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1281 below describes the starting and ending position of this segment on each transcript.

Table 1281 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T15    1                            309
R11723_PEA_1_T17    1                            309
R11723_PEA_1_T19    1                            309
R11723_PEA_1_T20    1                            309
R11723_PEA_1_T5     1                            309
R11723_PEA_1_T6     1                            309



Segment cluster R11723_PEA_1_node_22 according to the present invention is supported
by 65 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T5 and R11723_PEA_1_T6. Table
1282 below describes the starting and ending position of this segment on each transcript.

Table 1282 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T5     1083                         1569
R11723_PEA_1_T6     1117                         1603



Segment cluster R11723_PEA_1_node_31 according to the present invention is supported
by 70 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1283 below describes the starting and ending position of this
segment on each transcript (it should be noted that these transcripts show alternative
polyadenylation).

Table 1283 - Segment location on transcripts

Transcript name     Segment starting position    Segment ending position
R11723_PEA_1_T15    1060                         1295
R11723_PEA_1_T5     1978                         2213
R11723_PEA_1_T6     2012                         2247



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.
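
The distinction between long and short segments can be sketched as a simple length check on
the tabulated positions. This is only an illustrative sketch: it assumes that the starting and
ending positions are 1-based and inclusive (consistent with adjacent segments such as 1-309
followed by 310-319), and the 120 bp cutoff and the function names are hypothetical
conveniences, since the text says only "up to about 120 bp".

```python
SHORT_SEGMENT_MAX_BP = 120  # assumption: the text says "up to about 120 bp"


def segment_length(start: int, end: int) -> int:
    """Length in bp of a segment given 1-based, inclusive positions (assumed)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def is_short_segment(start: int, end: int) -> bool:
    """True when the segment is no longer than the assumed short-segment cutoff."""
    return segment_length(start, end) <= SHORT_SEGMENT_MAX_BP


# Positions taken from the segment location tables for transcript R11723_PEA_1_T15.
node_31_length = segment_length(1060, 1295)  # described among the longer segments
node_10_length = segment_length(486, 529)    # described among the short segments
```

Under these assumptions the node_31 segment is well above the cutoff and the node_10 segment
is well below it, which matches how the two are grouped in the description.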



Segment cluster R11723_PEA_1_node_10 according to the present invention is supported
by 38 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1284 below describes the starting and ending position of this segment on each transcript.

Table 1284 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     486                          529
R11723_PEA_1_T17     486                          529
R11723_PEA_1_T19     486                          529
R11723_PEA_1_T20     486                          529
R11723_PEA_1_T5      486                          529
R11723_PEA_1_T6      520                          563



Segment cluster R11723_PEA_1_node_11 according to the present invention is supported
by 42 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1285 below describes the starting and ending position of this segment on each transcript.

Table 1285 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     530                          623
R11723_PEA_1_T17     530                          623
R11723_PEA_1_T19     530                          623
R11723_PEA_1_T20     530                          623
R11723_PEA_1_T5      530                          623
R11723_PEA_1_T6      564                          657



Segment cluster R11723_PEA_1_node_15 according to the present invention can be
found in the following transcript(s): R11723_PEA_1_T20. Table 1286 below describes the
starting and ending position of this segment on each transcript.

Table 1286 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T20     624                          627



Segment cluster R11723_PEA_1_node_18 according to the present invention is supported
by 40 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1287 below describes the starting and ending position of this
segment on each transcript.

Table 1287 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     624                          681
R11723_PEA_1_T5      777                          834
R11723_PEA_1_T6      811                          868



Segment cluster R11723_PEA_1_node_20 according to the present invention can be
found in the following transcript(s): R11723_PEA_1_T5 and R11723_PEA_1_T6. Table 1288
below describes the starting and ending position of this segment on each transcript.

Table 1288 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T5      1009                         1019
R11723_PEA_1_T6      1043                         1053



Segment cluster R11723_PEA_1_node_21 according to the present invention is supported
by 36 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T5 and R11723_PEA_1_T6. Table
1289 below describes the starting and ending position of this segment on each transcript.

Table 1289 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T5      1020                         1082
R11723_PEA_1_T6      1054                         1116



Segment cluster R11723_PEA_1_node_23 according to the present invention is supported
by 39 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T5 and R11723_PEA_1_T6. Table
1290 below describes the starting and ending position of this segment on each transcript.

Table 1290 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T5      1570                         1599
R11723_PEA_1_T6      1604                         1633



Segment cluster R11723_PEA_1_node_24 according to the present invention is supported
by 51 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1291 below describes the starting and ending position of this
segment on each transcript.

Table 1291 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     682                          765
R11723_PEA_1_T5      1600                         1683
R11723_PEA_1_T6      1634                         1717



Segment cluster R11723_PEA_1_node_25 according to the present invention is supported
by 54 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1292 below describes the starting and ending position of this
segment on each transcript.

Table 1292 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     766                          791
R11723_PEA_1_T5      1684                         1709
R11723_PEA_1_T6      1718                         1743



Segment cluster R11723_PEA_1_node_26 according to the present invention is supported
by 62 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1293 below describes the starting and ending position of this
segment on each transcript.

Table 1293 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     792                          904
R11723_PEA_1_T5      1710                         1822
R11723_PEA_1_T6      1744                         1856



Segment cluster R11723_PEA_1_node_27 according to the present invention is supported
by 67 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1294 below describes the starting and ending position of this
segment on each transcript.

Table 1294 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     905                          986
R11723_PEA_1_T5      1823                         1904
R11723_PEA_1_T6      1857                         1938



Segment cluster R11723_PEA_1_node_28 according to the present invention can be
found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1295 below describes the starting and ending position of this
segment on each transcript.

Table 1295 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     987                          1010
R11723_PEA_1_T5      1905                         1928
R11723_PEA_1_T6      1939                         1962



Segment cluster R11723_PEA_1_node_29 according to the present invention is supported
by 69 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1296 below describes the starting and ending position of this
segment on each transcript.

Table 1296 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     1011                         1038
R11723_PEA_1_T5      1929                         1956
R11723_PEA_1_T6      1963                         1990



Segment cluster R11723_PEA_1_node_3 according to the present invention can be found
in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1297 below describes the starting and ending position of this segment on each transcript.

Table 1297 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     310                          319
R11723_PEA_1_T17     310                          319
R11723_PEA_1_T19     310                          319
R11723_PEA_1_T20     310                          319
R11723_PEA_1_T5      310                          319
R11723_PEA_1_T6      310                          319



Segment cluster R11723_PEA_1_node_30 according to the present invention can be
found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T5 and
R11723_PEA_1_T6. Table 1298 below describes the starting and ending position of this
segment on each transcript.

Table 1298 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     1039                         1059
R11723_PEA_1_T5      1957                         1977
R11723_PEA_1_T6      1991                         2011



Segment cluster R11723_PEA_1_node_4 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1299 below describes the starting and ending position of this segment on each transcript.

Table 1299 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     320                          371
R11723_PEA_1_T17     320                          371
R11723_PEA_1_T19     320                          371
R11723_PEA_1_T20     320                          371
R11723_PEA_1_T5      320                          371
R11723_PEA_1_T6      320                          371



Segment cluster R11723_PEA_1_node_5 according to the present invention is supported
by 26 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1300 below describes the starting and ending position of this segment on each transcript.

Table 1300 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     372                          414
R11723_PEA_1_T17     372                          414
R11723_PEA_1_T19     372                          414
R11723_PEA_1_T20     372                          414
R11723_PEA_1_T5      372                          414
R11723_PEA_1_T6      372                          414

Segment cluster R11723_PEA_1_node_6 according to the present invention is supported
by 27 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1301 below describes the starting and ending position of this segment on each transcript.

Table 1301 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     415                          446
R11723_PEA_1_T17     415                          446
R11723_PEA_1_T19     415                          446
R11723_PEA_1_T20     415                          446
R11723_PEA_1_T5      415                          446
R11723_PEA_1_T6      415                          446



Segment cluster R11723_PEA_1_node_7 according to the present invention is supported
by 29 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T15, R11723_PEA_1_T17,
R11723_PEA_1_T19, R11723_PEA_1_T20, R11723_PEA_1_T5 and R11723_PEA_1_T6.
Table 1302 below describes the starting and ending position of this segment on each transcript.

Table 1302 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T15     447                          485
R11723_PEA_1_T17     447                          485
R11723_PEA_1_T19     447                          485
R11723_PEA_1_T20     447                          485
R11723_PEA_1_T5      447                          485
R11723_PEA_1_T6      447                          485

Segment cluster R11723_PEA_1_node_8 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R11723_PEA_1_T6. Table 1303 below describes
the starting and ending position of this segment on each transcript.

Table 1303 - Segment location on transcripts

Transcript name      Segment starting position    Segment ending position
R11723_PEA_1_T6      486                          519



Variant protein alignment to the previously known protein:

Sequence name: /tmp/gp6eQTLWqk/mFtjUpUzhb:Q8IXM0

Sequence documentation:

Alignment of: R11723_PEA_1_P6 x Q8IXM0

Alignment segment 1/1:

Quality: 1128.00
Escore: 0
Matching length: 112                   Total length: 112
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

111 MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLRE 160
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLRE 50

161 GEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRE 210
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 GEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRE 100

211 RQRKEKHSMRTQ 222
    ||||||||||||
101 RQRKEKHSMRTQ 112



Sequence name: /tmp/gp6eQTLWqk/mFtjUpUzhb:Q96AC2

Sequence documentation:

Alignment of: R11723_PEA_1_P6 x Q96AC2

Alignment segment 1/1:

Quality: 835.00
Escore: 0
Matching length: 83                    Total length: 83
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 83
    |||||||||||||||||||||||||||||||||
 51 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 83

Sequence name: /tmp/gp6eQTLWqk/mFtjUpUzhb:Q8N2G4

Sequence documentation:

Alignment of: R11723_PEA_1_P6 x Q8N2G4

Alignment segment 1/1:

Quality: 835.00
Escore: 0
Matching length: 83                    Total length: 83
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 83
    |||||||||||||||||||||||||||||||||
 51 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 83

Sequence name: /tmp/gp6eQTLWqk/mFtjUpUzhb:BAC85518

Sequence documentation:

Alignment of: R11723_PEA_1_P6 x BAC85518

Alignment segment 1/1:

Quality: 835.00
Escore: 0
Matching length: 83                    Total length: 83
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 24 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 73

 51 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 83
    |||||||||||||||||||||||||||||||||
 74 QDMCQKEVMEQSAGIMYRKSCASSAACLIASAG 106



Sequence name: /tmp/VXjdFlzdBX/bexTxThOTh:Q96AC2

Sequence documentation:

Alignment of: R11723_PEA_1_P7 x Q96AC2

Alignment segment 1/1:

Quality: 654.00
Escore: 0
Matching length: 64                    Total length: 64
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSAG 64
    ||||||||||||||
 51 QDMCQKEVMEQSAG 64

Sequence name: /tmp/VXjdFlzdBX/bexTxThOTh:Q8N2G4

Sequence documentation:

Alignment of: R11723_PEA_1_P7 x Q8N2G4

Alignment segment 1/1:

Quality: 654.00
Escore: 0
Matching length: 64                    Total length: 64
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSAG 64
    ||||||||||||||
 51 QDMCQKEVMEQSAG 64

Sequence name: /tmp/VXjdFlzdBX/bexTxThOTh:BAC85273

Sequence documentation:

Alignment of: R11723_PEA_1_P7 x BAC85273

Alignment segment 1/1:

Quality: 600.00
Escore: 0
Matching length: 59                    Total length: 59
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  6 IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ 55
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 22 IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ 71

 56 KEVMEQSAG 64
    |||||||||
 72 KEVMEQSAG 80

Sequence name: /tmp/VXjdFlzdBX/bexTxThOTh:BAC85518

Sequence documentation:

Alignment of: R11723_PEA_1_P7 x BAC85518

Alignment segment 1/1:

Quality: 654.00
Escore: 0
Matching length: 64                    Total length: 64
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 24 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 73

 51 QDMCQKEVMEQSAG 64
    ||||||||||||||
 74 QDMCQKEVMEQSAG 87
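
The length and identity figures reported for an ungapped alignment segment can be checked with a
short sketch such as the following, here applied to the R11723_PEA_1_P7 x BAC85273 alignment
(residues 6-64 of the variant against residues 22-80 of BAC85273). The function names are
hypothetical, positions are assumed to be 1-based and inclusive, and the sketch does not
attempt to reproduce the Quality or Escore values, whose scoring scheme is not given here.

```python
def span_length(first: int, last: int) -> int:
    """Number of residues between two 1-based, inclusive positions (assumed)."""
    return last - first + 1


def ungapped_percent_identity(query: str, subject: str) -> float:
    """Percent of identical residues between two equal-length, gap-free segments."""
    if len(query) != len(subject):
        raise ValueError("an ungapped alignment needs equal-length segments")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


# Aligned residues copied from the P7 x BAC85273 alignment.
P7_RESIDUES_6_TO_64 = (
    "IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ"
    "KEVMEQSAG"
)
BAC85273_RESIDUES_22_TO_80 = (
    "IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ"
    "KEVMEQSAG"
)

matching_length = len(P7_RESIDUES_6_TO_64)
identity = ungapped_percent_identity(P7_RESIDUES_6_TO_64, BAC85273_RESIDUES_22_TO_80)
```

Both position ranges span the same number of residues as the aligned strings, which is the
consistency one would expect between the reported matching length and the displayed alignment.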



Sequence name: /tmp/OLMSexEmIh/pc7Z7XmlYR:Q96AC2

Sequence documentation:

Alignment of: R11723_PEA_1_P10 x Q96AC2

Alignment segment 1/1:

Quality: 645.00
Escore: 0
Matching length: 63                    Total length: 63
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSA 63
    |||||||||||||
 51 QDMCQKEVMEQSA 63

Sequence name: /tmp/OLMSexEmIh/pc7Z7XmlYR:Q8N2G4

Sequence documentation:

Alignment of: R11723_PEA_1_P10 x Q8N2G4

Alignment segment 1/1:

Quality: 645.00
Escore: 0
Matching length: 63                    Total length: 63
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSA 63
    |||||||||||||
 51 QDMCQKEVMEQSA 63

Sequence name: /tmp/OLMSexEmIh/pc7Z7XmlYR:BAC85273

Sequence documentation:

Alignment of: R11723_PEA_1_P10 x BAC85273

Alignment segment 1/1:

Quality: 591.00
Escore: 0
Matching length: 58                    Total length: 58
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  6 IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ 55
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 22 IAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQ 71

 56 KEVMEQSA 63
    ||||||||
 72 KEVMEQSA 79

Sequence name: /tmp/OLMSexEmIh/pc7Z7XmlYR:BAC85518

Sequence documentation:

Alignment of: R11723_PEA_1_P10 x BAC85518

Alignment segment 1/1:

Quality: 645.00
Escore: 0
Matching length: 63                    Total length: 63
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 24 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 73

 51 QDMCQKEVMEQSA 63
    |||||||||||||
 74 QDMCQKEVMEQSA 86



Alignment of: R11723_PEA_1_P13 x Q96AC2

Alignment segment 1/1:

Quality: 645.00
Escore: 0
Matching length: 63                    Total length: 63
Matching Percent Similarity: 100.00    Matching Percent Identity: 100.00
Total Percent Similarity: 100.00       Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNV 50

 51 QDMCQKEVMEQSA 63
    |||||||||||||
 51 QDMCQKEVMEQSA 63



It should be noted that the nucleotide transcript sequence of the known protein (PSEC, also
referred to herein as the "wild type" or WT protein) features at least one SNP that appears to
affect the coding region, in addition to certain silent SNPs (this SNP does not have an effect on
the R11723_PEA_1_T5 splice variant sequence): "G->" resulting in a missing nucleotide
(affects amino acids from position 91 onwards). The missing nucleotide creates a frame shift,
resulting in a new protein. This SNP was not previously identified and is supported by 5 ESTs
out of ~70 ESTs in this exon.

It should be noted that the variants of this cluster are variants of the hypothetical protein
PSEC0181 (referred to herein as "PSEC"). Furthermore, use of the known protein (WT protein)
for detection of lung cancer, alone or in combination with one or more variants of this cluster
and/or of any other cluster and/or of any known marker, also comprises an embodiment of the
present invention.



Expression of R11723 transcripts which are detectable by amplicon as depicted in sequence
name R11723 seg13 in normal and cancerous lung tissues

Expression of transcripts detectable by or according to R11723 seg13, R11723 seg13 amplicon
(SEQ ID NO: 1684), and R11723 seg13F (SEQ ID NO: 1682) and R11723 seg13R (SEQ ID
NO: 1683) primers was measured by real time PCR. In parallel the expression of four
housekeeping genes, PBGD (GenBank Accession No. BC019323; amplicon - PBGD-amplicon,
SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon,
SEQ ID NO:1297), SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon,
SEQ ID NO:331) and Ubiquitin (GenBank Accession No. BC000449; amplicon -
Ubiquitin-amplicon, SEQ ID NO:328), was measured similarly. For each RT sample, the
expression of the above amplicon was normalized to the geometric mean of the quantities of the
housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93,
96-99, Table 2 "Tissue samples in testing panel", above), to obtain a value of fold up-regulation
for each sample relative to the median of the normal PM samples.
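
The two-step calculation described above (normalization to the geometric mean of the
housekeeping genes, then division by the median of the normal samples) can be sketched as
follows. This is only an illustrative sketch: the function names, sample labels and all
quantities are hypothetical and are not data from the experiment, and the 5 fold threshold is
the one used when the results are summarized.

```python
from statistics import geometric_mean, median


def normalize_to_housekeeping(target_quantity, housekeeping_quantities):
    """Divide the amplicon quantity of one RT sample by the geometric mean
    of the housekeeping-gene quantities measured in the same sample."""
    return target_quantity / geometric_mean(housekeeping_quantities)


def fold_relative_to_reference(normalized, reference_samples):
    """Divide each normalized quantity by the median of the reference samples."""
    reference_median = median(normalized[s] for s in reference_samples)
    return {sample: value / reference_median for sample, value in normalized.items()}


def count_over_expressed(fold_values, threshold=5.0):
    """Count samples whose fold value is at least the threshold."""
    return sum(1 for value in fold_values.values() if value >= threshold)


# Hypothetical quantities, for illustration only.
housekeeping = [1.0, 4.0, 2.0, 2.0]  # four housekeeping-gene quantities
normalized = {
    "normal_1": normalize_to_housekeeping(2.0, housekeeping),
    "normal_2": normalize_to_housekeeping(4.0, housekeeping),
    "normal_3": normalize_to_housekeeping(6.0, housekeeping),
    "tumor_1": normalize_to_housekeeping(24.0, housekeeping),
}
fold = fold_relative_to_reference(normalized, ["normal_1", "normal_2", "normal_3"])
over_expressed = count_over_expressed(fold)
```

Using the median of the normal samples as the reference makes the fold value of a typical
normal sample close to 1, so a value of 5 or more can be read directly as at least 5 fold
up-regulation.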

Figure 48 is a histogram showing over-expression of the above-indicated transcripts in
cancerous lung samples relative to the normal samples. The number and percentage of samples
that exhibit at least 5 fold over-expression, out of the total number of samples tested, is
indicated at the bottom.

As is evident from Figure 48, the expression of transcripts detectable by the above
amplicon(s) in cancer samples was higher than in the non-cancerous samples (Sample Nos.
47-50, 90-93, 96-99, Table 2 "Tissue samples in testing panel"). Notably an over-expression of
at least 5 fold was found in 10 out of 15 adenocarcinoma samples, and in 4 out of 8 small cell
carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: R11723 seg13F forward primer;
and R11723 seg13R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: R11723 seg13.

R11723seg13F (SEQ ID NO: 1682) - ACACTAAAAGAACAAACACCTTGCTC
R11723seg13R (SEQ ID NO: 1683) - TCCTCAGAAGGCACATGAAAGA

R11723seg13 - amplicon (SEQ ID NO: 1684):

ACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCA
GTTGACCACTTAGTTCTCAAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGA
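
The relationship between a primer pair and its amplicon can be checked with a short sketch:
the amplicon is expected to begin with the forward primer and to end with the reverse
complement of the reverse primer. The function names are hypothetical; the sequences are the
R11723seg13 primers and amplicon listed above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return sequence.translate(_COMPLEMENT)[::-1]


def primers_flank_amplicon(forward: str, reverse: str, amplicon: str) -> bool:
    """True if the amplicon starts with the forward primer and ends with the
    reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(reverse_complement(reverse))


SEG13_FORWARD = "ACACTAAAAGAACAAACACCTTGCTC"   # SEQ ID NO: 1682
SEG13_REVERSE = "TCCTCAGAAGGCACATGAAAGA"       # SEQ ID NO: 1683
SEG13_AMPLICON = (                             # SEQ ID NO: 1684
    "ACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCA"
    "GTTGACCACTTAGTTCTCAAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGA"
)

seg13_consistent = primers_flank_amplicon(SEG13_FORWARD, SEG13_REVERSE, SEG13_AMPLICON)
```

A check of this kind is a convenient way to catch transcription errors in primer or amplicon
listings, since a single wrong base at either end makes the test fail.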

Expression of R11723 transcripts which are detectable by amplicon as depicted in sequence
name R11723seg13 in different normal tissues

Expression of R11723 transcripts detectable by or according to R11723seg13 amplicon
(SEQ ID NO: 1684), and R11723seg13F (SEQ ID NO: 1682) and R11723seg13R (SEQ ID NO:
1683) primers was measured by real time PCR. In parallel the expression of four housekeeping
genes, RPL19 (GenBank Accession No. NM_000981; RPL19 amplicon, SEQ ID NO:1630), TATA
box (GenBank Accession No. NM_003194; TATA amplicon, SEQ ID NO: 1633), UBC
(GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and
SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331),
was measured similarly. For each RT sample, the expression of the above amplicon was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the ovary
samples (Sample Nos. 18-20, Table 2 "Tissue samples in normal panel" above), to obtain a
value of relative expression of each sample relative to the median of the ovary samples.

R11723seg13F (SEQ ID NO: 1682) - ACACTAAAAGAACAAACACCTTGCTC
R11723seg13R (SEQ ID NO: 1683) - TCCTCAGAAGGCACATGAAAGA

R11723seg13 - amplicon (SEQ ID NO: 1684):

ACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCAGTTG
ACCACTTAGTTCTCAAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGA

The results are presented in Figure 49, showing the expression of R11723 transcripts which
are detectable by amplicon as depicted in sequence name R11723seg13 in different normal
tissues.

Expression of R11723 transcripts, which are detectable by amplicon as depicted in sequence
name R11723 junc11-18 in normal and cancerous lung tissues.

Expression of transcripts detectable by or according to junc11-18, R11723 junc11-18
amplicon (SEQ ID NO: 1687) and R11723 junc11-18F (SEQ ID NO: 1685) and R11723
junc11-18R (SEQ ID NO: 1686) primers was measured by real time PCR (this junction is found
in the known protein sequence or "wild type" (WT) sequence, also termed herein the PSEC
sequence). In parallel the expression of four housekeeping genes, PBGD (GenBank Accession No.
BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1 (GenBank Accession No.
NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297), SDHA (GenBank Accession
No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331), and Ubiquitin (GenBank
Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328), was measured
similarly. For each RT sample, the expression of the above amplicon was normalized to the
geometric mean of the quantities of the housekeeping genes. The normalized quantity of each
RT sample was then divided by the median of the quantities of the normal post-mortem (PM)
samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above: "Tissue samples in lung cancer
testing panel"), to obtain a value of fold up-regulation for each sample relative to the median of
the normal PM samples.

Figure 50 is a histogram showing over-expression of the above-indicated transcripts in
cancerous lung samples relative to the normal samples. Values represent the average of
duplicate experiments. Error bars indicate the minimal and maximal values obtained.

As is evident from Figure 50, the expression of transcripts detectable by the above
amplicon in cancer samples was higher than in the non-cancerous samples (Sample Nos. 47-50,
90-93, 96-99, Table 2 "Tissue samples in lung cancer testing panel"). Notably an
over-expression of at least 5 fold was found in 11 out of 15 adenocarcinoma samples, 4 out of 16
squamous cell carcinoma samples, 1 out of 4 large cell carcinoma samples and in 5 out of 8
small cell carcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: R11723 junc11-18F forward
primer; and R11723 junc11-18R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: R11723
junc11-18.

R11723 junc11-18F (SEQ ID NO: 1685)- AGTGATGGAGCAAAGTGCCG
R11723 junc11-18R (SEQ ID NO: 1686)- CAGCAGCTGATGCAAACTGAG
R11723 junc11-18- amplicon (SEQ ID NO: 1687)
AGTGATGGAGCAAAGTGCCGGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGG
CCTGTCTCATCGCCTCTGCCGGGTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACT
CAGTTTGCATCAGCTGCTG
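
As a consistency check on the sequences listed above, the following sketch tests whether the forward primer matches the start of the amplicon and whether the reverse complement of the reverse primer matches its end. The helper names are illustrative only; the sequences are copied from the listing above.

```python
FORWARD = "AGTGATGGAGCAAAGTGCCG"
REVERSE = "CAGCAGCTGATGCAAACTGAG"
AMPLICON = (
    "AGTGATGGAGCAAAGTGCCGGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGG"
    "CCTGTCTCATCGCCTCTGCCGGGTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACT"
    "CAGTTTGCATCAGCTGCTG"
)

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def primers_flank(amplicon, forward, reverse):
    """True if the amplicon begins with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )
```

With the sequences above, `primers_flank(AMPLICON, FORWARD, REVERSE)` evaluates to `True`, which suggests the listing was recovered without damage to the primer binding regions.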



Expression of R11723 transcripts, which were detected by amplicon as depicted in the sequence
name R11723 junc11-18 in different normal tissues.



Expression of R11723 transcripts detectable by or according to R11723seg13 amplicon (SEQ ID
NO: 1687) and R11723 junc11-18F (SEQ ID NO: 1685), R11723 junc11-18R (SEQ ID NO:
1686) was measured by real time PCR. In parallel the expression of four housekeeping genes
RPL19 (GenBank Accession No. NM_000981; RPL19 amplicon, SEQ ID NO:1630), TATA
box (GenBank Accession No. NM_003194; TATA amplicon, SEQ ID NO: 1633), UBC
(GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and
SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331)
was measured similarly. For each RT sample, the expression of the above amplicon was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the ovary
samples (Sample Nos. 18-20 Table 3 above), to obtain a value of relative expression of each
sample relative to the median of the ovary samples.



R11723 junc11-18F (SEQ ID NO: 1685)- AGTGATGGAGCAAAGTGCCG
R11723 junc11-18R (SEQ ID NO: 1686)- CAGCAGCTGATGCAAACTGAG
R11723 junc11-18 - amplicon (SEQ ID NO: 1687)
AGTGATGGAGCAAAGTGCCGGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGG
CCTGTCTCATCGCCTCTGCCGGGTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACT
CAGTTTGCATCAGCTGCTG



The results are demonstrated in Figure 73, showing the expression of R11723 transcripts, which
were detected by amplicon as depicted in the sequence name R11723 junc11-18 in different
normal tissues.

Cloning of this variant
Full length validation

RNA preparation

Human adult papillary adenocarcinoma ovary RNA pool (lot# ILS1408) was obtained
from ABS (Wilmington, DE 19801, USA, http://www.absbioreagents.com). Total RNA
samples were treated with DNaseI (Ambion Cat # 1906).
RT PCR

RT preparation

Purified RNA (1 ug) was mixed with 150 ng Random Hexamer primers (Invitrogen Cat
# 48190-011) and 500 uM dNTP (Takara, Cat # B9501-1) in a total volume of 15.6 ul
DEPC-H2O (Beit Haemek, Cat # 01-852-1A). The mixture was incubated for 5 min at 65°C and
then quickly chilled on ice. Thereafter, 5 ul of 5X SuperscriptII first strand buffer (Invitrogen,
Cat # Y00146), 2.4 ul 0.1M DTT (Invitrogen, Cat # Y00147) and 40 units RNasin (Promega,
Cat # N251A) were added, and the mixture was incubated for 2 min at 42°C. Then, 1 ul
(200 units) of SuperscriptII (Invitrogen, Cat # 18064-022) was added and the reaction was
incubated for 50 min at 42°C and then inactivated at 70°C for 15 min. The resulting cDNA was
diluted 1:20 in TE buffer (10 mM Tris pH=8, 1 mM EDTA pH=8).
PCR amplification and analysis

cDNA (5 ul), prepared as described above, was used as a template in PCR reactions. The
amplification was done using AccuPower PCR PreMix (Bioneer, Korea, Cat# K2016), under the
following conditions: 1 ul - of each primer (10 uM)
PSECfor- TGCTGTCGCCTCCTCTGATG
PSECrev- CCTCAGAAGGCACATGAAAG

plus 13 ul - H2O were added into AccuPower PCR PreMix tube with a reaction program of 5
minutes at 94°C; 35 cycles of: [30 seconds at 94°C, 30 seconds at 52°C, 40 seconds at 72°C] and
10 minutes at 72°C. At the end of the PCR amplification, products were analyzed on agarose
gels stained with ethidium bromide and visualized with UV light. PCR product was extracted
from the gel using QiaQuick™ gel extraction kit (Qiagen™, Cat #28706). The extracted DNA
product (Figure 79) was sequenced by direct sequencing using the gene specific primers from
above (Hy-Labs, Israel), resulting in the expected sequence of PSEC variant R11723_PEA_1_T5
(Figure 80).

It was concluded that the predicted PSEC variant R11723_PEA_1_T5 is indeed a
naturally expressed variant in an adult papillary adenocarcinoma ovary human tissue as shown
in Figure 79.

Cloning of PSEC variant R11723_PEA_1_T5 into bacterial expression vector

The PSEC splice variant R11723_PEA_1_T5 coding sequence was prepared for cloning
by PCR amplification using the fragment described above as template and Platinum Pfx DNA
polymerase (Invitrogen Cat # 11708021) under the following conditions: 5 ul - Amplification
X10 buffer (Invitrogen Cat # 11708021); 2 ul - PCR product from above; 1 ul - dNTPs (10 mM
each); 1 ul MgSO4 (50 mM); 5 ul enhancer solution (Invitrogen Cat # 11708021); 33 ul - H2O;
1 ul - of each primer (10 uM) and 1.25 units of Taq polymerase [Platinum Pfx DNA polymerase
(Invitrogen Cat # 11708021)] in a total reaction volume of 50 ul with a reaction program of 3
minutes at 94°C; 29 cycles of: [30 seconds at 94°C, 30 seconds at 58°C, 40 seconds at 68°C] and
7 minutes at 68°C. The primers listed below include specific sequences of the nucleotide
sequence corresponding to the splice variant and NheI and HindIII restriction sites.

PSEC NheIfor- ATAGCTAGCATGTGGGTCCTAGGCATCGCGG

PSEC HindIIIrev- CCCAAGCTTCTAAGTGGTCAACTGCTTGGC
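
The restriction sites carried by the two cloning primers can be located with a short sketch. It assumes the standard recognition sequences GCTAGC for NheI and AAGCTT for HindIII; the helper name is illustrative only.

```python
NHEI_SITE = "GCTAGC"     # NheI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

PSEC_NHEI_FOR = "ATAGCTAGCATGTGGGTCCTAGGCATCGCGG"
PSEC_HINDIII_REV = "CCCAAGCTTCTAAGTGGTCAACTGCTTGGC"


def sequence_after_site(primer, site):
    """Return the part of the primer that follows the first occurrence
    of the restriction site, or None if the site is absent."""
    position = primer.find(site)
    if position < 0:
        return None
    return primer[position + len(site):]
```

In the forward primer the NheI site is followed directly by ATG, and in the reverse primer the HindIII site is followed by CTA, the reverse complement of a TAG stop codon. This is consistent with the primers framing the coding sequence of the variant.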

The PCR product was then double digested with NheI and HindIII (New England
Biolabs (UK) LTD) (Figure 81), and inserted into pRSET-A (Invitrogen, Cat# V351-20),
previously digested with the same enzymes, in-frame to an N-terminal 6His-tag, to give
HisPSEC T5 pRSET (Figure 82). The coding sequence encodes for a protein having the
6His-tag at the N' end (6His residues in a row at one end of the protein), and 8 additional
amino acids encoded by the pRSET vector.

The sequence of the PSEC insert in the final plasmid, as well as its flanking regions,
were verified by sequencing and found to be identical to the desired sequences. The complete
sequence of HisPSEC T5 pRSETA, including the sequenced regions, is shown in Figure 84.

Figure 83 shows the translated sequence of PSEC variant R11723_PEA_1_T5.
Bacterial culture and induction of protein expression

HisPSEC pRSETA DNA was transformed into competent DH5α cells (Invitrogen
Cat# 18258-012). Ampicillin resistant transformants were screened and positive clones were
further analyzed by restriction enzyme digestion and sequence verification.

In order to express the recombinant protein, HisPSEC pRSETA DNA was further
transformed into competent BL21Gold cells (Stratagene Cat#230134) and BL21star (Invitrogen
Cat# 44-0054). Ampicillin resistant transformants were screened and positive clones were
selected.

Bacterial cells containing the HisPSEC T5 pRSET vector or empty pRSET vector (as
negative control) were grown in LB medium, supplemented with Ampicillin (50 ug/ml) and
chloramphenicol (34 ug/ml), until O.D.600nm reached 0.55. This value was reached in about 3
hours. 1 mM IPTG (Roche, Cat #724815) was added and the cells were grown at 37°C
overnight. 1 ml aliquots of each culture were removed for gel analysis at time zero, 3 hrs after
induction and following overnight incubation (T0, T3 and TO/N, respectively).

Expression Results 

The time course of small-scale expression of PSEC in BL21Gold is demonstrated in
Figure 85. The expression of a recombinant protein with the appropriate molecular weight (9.2
kDa) was visualized by Western Blot with anti-His antibodies (BD Clontech, Ref 631212,
Figure 85), but not by Coomassie staining (data not shown). A similar expression pattern was
obtained with BL21star as well (data not shown).

These results show that the protein encoded by PSEC variant R11723_PEA_1_T5 is
indeed expressed in bacterial cells.




DESCRIPTION FOR CLUSTER R16276 
Cluster R16276 features 1 transcript(s) and 5 segment(s) of interest, the names for which
are given in Tables 1305 and 1306, respectively, the sequences themselves are given at the end
of the application. The selected protein variants are given in Table 1307.



Table 1305 - Transcripts of interest

Transcript Name        Sequence ID No.
R16276_PEA_1_T6        150


Table 1306 - Segments of interest

Segment Name              Sequence ID No.
R16276_PEA_1_node_0       1017
R16276_PEA_1_node_6       1018
R16276_PEA_1_node_1       1019
R16276_PEA_1_node_4       1020
R16276_PEA_1_node_5       1021



Table 1307 - Proteins of interest

Protein Name           Sequence ID No.    Corresponding Transcript(s)
R16276_PEA_1_P7        1414               R16276_PEA_1_T6



These sequences are variants of the known protein NOV protein homolog precursor
(SwissProt accession identifier NOV_HUMAN; known also according to the synonyms NovH;
Nephroblastoma overexpressed gene protein homolog), SEQ ID NO: 1463, referred to herein as
the previously known protein.

Protein NOV protein homolog precursor is known or believed to have the following
function(s): Immediate-early protein, likely to play a role in cell growth regulation (By
similarity). The sequence for protein NOV protein homolog precursor is given at the end of the
application, as "NOV protein homolog precursor amino acid sequence". Known polymorphisms
for this sequence are as shown in Table 1308.

Table 1308 - Amino acid mutations for Known Protein

SNP position(s) on amino acid sequence    Comment
97                                        N->K



Protein NOV protein homolog precursor localization is believed to be Secreted. 

The following GO Annotation(s) apply to the previously known protein. The following
annotation(s) were found: regulation of cell growth, which are annotation(s) related to
Biological Process; insulin-like growth factor binding; growth factor, which are annotation(s)
related to Molecular Function; and extracellular, which are annotation(s) related to Cellular
Component.

The GO assignment relies on information from one or more of the SwissProt/TremBl
Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available
from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.

Cluster R16276 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 51 refer to weighted expression of ESTs in
each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 51 and Table 1309. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: lung malignant tumors.



Table 1310 - Normal tissue distribution

Name of Tissue    Number
Adrenal           977
Bone              32
Brain             24
Colon             0
Epithelial        63
General           43
Kidney            24
Liver             341
Lung              0
Breast            0
Muscle            20
Ovary             0
Pancreas          0
Prostate          24
Skin              13
Stomach           146
Uterus            0



Table 1311 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3      SP2        R4
Adrenal           5.9e-01    6.2e-01    1          0.2     9.9e-01    0.2
Bone              5.5e-01    7.3e-01    1          0.8     1          0.6
Brain             2.8e-01    4.4e-01    6.8e-01    0.9     8.9e-01    0.6
Colon             2.6e-01    3.3e-01    4.9e-01    2.0     5.9e-01    1.7
Epithelial        2.6e-01    2.9e-01    9.7e-01    0.6     1          0.5
General           4.1e-01    6.8e-01    9.4e-01    0.7     1          0.5
Kidney            8.3e-01    7.7e-01    6.2e-01    1.2     5.3e-01    1.4
Liver             9.1e-01    7.5e-01    1          0.1     1          0.1
Lung              2.3e-02    9.1e-02    8.0e-04    10.5    2.1e-02    5.1
Breast            5.9e-01    6.7e-01    6.9e-01    1.5     8.2e-01    1.2
Muscle            5.2e-01    6.1e-01    2.7e-01    3.2     6.3e-01    1.2
Ovary             6.2e-01    6.5e-01    6.8e-01    1.5     7.7e-01    1.3
Pancreas          3.3e-01    4.4e-01    4.2e-01    2.4     5.3e-01    1.9
Prostate          9.3e-01    9.4e-01    1          0.5     9.4e-01    0.6
Skin              9.2e-01    6.8e-01    1          0.5     4.1e-01    1.1
Stomach           5.0e-01    7.3e-01    5.0e-01    0.6     9.7e-01    0.4
Uterus            2.4e-01    1.6e-01    2.9e-01    2.5     4.1e-01    2.0



As noted above, cluster R16276 features 1 transcript(s), which were listed in Table 1 
above. These transcript(s) encode for protein(s) which are variant(s) of protein NOV protein 
homolog precursor. A description of each variant protein according to the present invention is 
now provided. 

Variant protein R16276_PEA_1_P7 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
R16276_PEA_1_T6. An alignment is given to the known protein (NOV protein homolog
precursor) at the end of the application. One or more alignments to one or more previously
published protein sequences are given at the end of the application. A brief description of the
relationship of the variant protein according to the present invention to each such aligned protein
is as follows:

Comparison report between R16276_PEA_1_P7 and NOV_HUMAN:

1. An isolated chimeric polypeptide encoding for R16276_PEA_1_P7, comprising a first
amino acid sequence being at least 90 % homologous to
MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPG corresponding to amino
acids 1 - 41 of NOV_HUMAN, which also corresponds to amino acids 1 - 41 of
R16276_PEA_1_P7, a bridging amino acid Q corresponding to amino acid 42 of
R16276_PEA_1_P7, a second amino acid sequence being at least 90 % homologous to
CPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTGICT
corresponding to amino acids 43 - 103 of NOV_HUMAN, which also corresponds to amino
acids 43 - 103 of R16276_PEA_1_P7, and a third amino acid sequence being at least 70%,
optionally at least 80%, preferably at least 85%, more preferably at least 90% and most
preferably at least 95% homologous to a polypeptide having the sequence GNPAPSAV
corresponding to amino acids 104 - 111 of R16276_PEA_1_P7, wherein said first amino acid
sequence, bridging amino acid, second amino acid sequence and third amino acid sequence are
contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of R16276_PEA_1_P7, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence GNPAPSAV in R16276_PEA_1_P7.
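
The positional bookkeeping in the comparison report can be checked by concatenating the four parts in order. The sketch below is illustrative only; the variable names are not part of the application, and the coding-portion coordinates (445 to 777) are taken from the transcript description given for R16276_PEA_1_T6.

```python
FIRST = "MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPG"  # amino acids 1-41
BRIDGE = "Q"                                          # amino acid 42
SECOND = (
    "CPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTGI"
    "CT"
)                                                     # amino acids 43-103
TAIL = "GNPAPSAV"                                     # amino acids 104-111

# Contiguous, sequential assembly as stated in the comparison report.
VARIANT = FIRST + BRIDGE + SECOND + TAIL

# Coding portion of transcript R16276_PEA_1_T6 (1-based, inclusive).
CDS_START, CDS_END = 445, 777
CDS_LENGTH = CDS_END - CDS_START + 1
```

The assembled sequence is 111 residues long and the coding portion spans 333 nucleotides, that is, 111 codons, so the stated positions are mutually consistent.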

The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein R16276_PEA_1_P7 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1312, (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein R16276_PEA_1_P7
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1313 - Amino acid mutations

SNP position(s) on amino acid sequence    Alternative amino acid(s)    Previously known SNP?
42                                        Q->R                         Yes



The glycosylation sites of variant protein R16276_PEA_1_P7, as compared to the known
protein NOV protein homolog precursor, are described in Table 1314 (given according to their
position(s) on the amino acid sequence in the first column; the second column indicates whether
the glycosylation site is present in the variant protein; and the last column indicates whether the
position is different on the variant protein).

Table 1314 - Glycosylation site(s)

Position(s) on known amino acid sequence    Present in variant protein?    Position in variant protein?
280                                         no
97                                          yes                            97



Variant protein R16276_PEA_1_P7 is encoded by the following transcript(s):
R16276_PEA_1_T6, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript R16276_PEA_1_T6 is shown in bold; this coding portion starts at
position 445 and ends at position 777. The transcript also has the following SNPs as listed in
Table 1315 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein R16276_PEA_1_P7 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1315 - Nucleic acid SNPs

SNP position on nucleotide sequence    Alternative nucleic acid    Previously known SNP?
371                                    G->                         No
430                                    A->G                        No
569                                    A->G                        Yes
729                                    C->A                        Yes
827                                    G->T                        Yes



As noted above, cluster R16276 features 5 segment(s), which were listed in Table 2 above
and for which the sequence(s) are given at the end of the application. These segment(s) are
portions of nucleic acid sequence(s) which are described herein separately because they are of
particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster R16276_PEA_1_node_0 according to the present invention is supported
by 35 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R16276_PEA_1_T6. Table 1316 below describes
the starting and ending position of this segment on each transcript.

Table 1316 - Segment location on transcripts

Transcript name        Segment starting position    Segment ending position
R16276_PEA_1_T6        1                            438



Segment cluster R16276_PEA_1_node_6 according to the present invention is supported
by 2 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R16276_PEA_1_T6. Table 1317 below describes
the starting and ending position of this segment on each transcript.

Table 1317 - Segment location on transcripts

Transcript name        Segment starting position    Segment ending position
R16276_PEA_1_T6        755                          876



According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster R16276_PEA_1_node_1 according to the present invention is supported
by 37 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R16276_PEA_1_T6. Table 1318 below describes
the starting and ending position of this segment on each transcript.

Table 1318 - Segment location on transcripts

Transcript name        Segment starting position    Segment ending position
R16276_PEA_1_T6        439                          528



Segment cluster R16276_PEA_1_node_4 according to the present invention is supported
by 38 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R16276_PEA_1_T6. Table 1319 below describes
the starting and ending position of this segment on each transcript.

Table 1319 - Segment location on transcripts

Transcript name        Segment starting position    Segment ending position
R16276_PEA_1_T6        529                          639



Segment cluster R16276_PEA_1_node_5 according to the present invention is supported
by 37 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): R16276_PEA_1_T6. Table 1320 below describes
the starting and ending position of this segment on each transcript.

Table 1320 - Segment location on transcripts

Transcript name        Segment starting position    Segment ending position
R16276_PEA_1_T6        640                          754
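
The five segment locations given in Tables 1316 to 1320 can be checked for gaps and overlaps. The sketch below is illustrative only (the function names are hypothetical); it also reports which segment contains a given transcript position, such as the start and end of the coding portion.

```python
# (segment name, start, end) on transcript R16276_PEA_1_T6, 1-based inclusive
SEGMENTS = [
    ("R16276_PEA_1_node_0", 1, 438),
    ("R16276_PEA_1_node_6", 755, 876),
    ("R16276_PEA_1_node_1", 439, 528),
    ("R16276_PEA_1_node_4", 529, 639),
    ("R16276_PEA_1_node_5", 640, 754),
]


def tiles_without_gaps(segments):
    """True if, once sorted by start, each segment begins exactly one
    position after the previous segment ends."""
    ordered = sorted(segments, key=lambda seg: seg[1])
    return all(
        nxt[1] == prev[2] + 1 for prev, nxt in zip(ordered, ordered[1:])
    )


def segment_containing(segments, position):
    """Return the name of the segment covering a position, or None."""
    for name, start, end in segments:
        if start <= position <= end:
            return name
    return None
```

With the table values, the segments tile positions 1 to 876 without gaps; position 445 (start of the coding portion) falls in node_1 and position 777 (end of the coding portion) falls in node_6.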



Variant protein alignment to the previously known protein:

Sequence name: NOV_HUMAN

Sequence documentation:

Alignment of: R16276_PEA_1_P7 x NOV_HUMAN

Alignment segment 1/1:

Quality: 1042.00
Escore: 0
Matching length: 103
Total length: 103
Matching Percent Similarity: 100.00
Matching Percent Identity: 99.03
Total Percent Similarity: 100.00
Total Percent Identity: 99.03
Gaps: 0

Alignment:

  1 MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGQCPATPPTC 50
    |||||||||||||||||||||||||||||||||||||||||:||||||||
  1 MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGRCPATPPTC 50

 51 APGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTG 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 APGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTG 100

101 ICT 103
    |||
101 ICT 103
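
The reported identity of 99.03% over a matching length of 103 corresponds to a single mismatch (Q versus R at position 42). A minimal sketch of that calculation for two gap-free, equal-length sequences follows; the function name is hypothetical and this is not the alignment program used to produce the report.

```python
VARIANT_1_103 = (
    "MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGQCPATPPTC"
    "APGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTG"
    "ICT"
)
KNOWN_1_103 = (
    "MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGRCPATPPTC"
    "APGVRAVLDGCSCCLVCARQRGESCSDLEPCDESSGLYCDRSADPSNQTG"
    "ICT"
)


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two gap-free sequences
    of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)
```

Here 102 of the 103 aligned positions are identical, and 102/103 rounds to 99.03%, in agreement with the value in the alignment summary.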
Combined expression of 6 sequences H61775seg8, HUMGRP5E junc3-7,
M85491Seg24, Z21368 junc17-21, HSSTROL3seg24 and Z25299seg20 in normal and
cancerous lung tissues.

Expression of immunoglobulin superfamily, member 9, gastrin-releasing peptide, Ephrin
type-B receptor 2 precursor, SUL1_HUMAN, Stromelysin-3 precursor (EC 3.4.24.-) (Matrix
metalloproteinase-11) (MMP-11) (ST3) (SL-3) and Secretory leukocyte protease inhibitor Acid-
stable proteinase inhibitor transcripts detectable by or according to H61775seg8 (SEQ ID NO:
1636), HUMGRP5E junc3-7 (SEQ ID NO: 1648), M85491Seg24 (SEQ ID NO: 1639), Z21368
junc17-21 (SEQ ID NO: 1642), HSSTROL3seg24 (SEQ ID NO: 1675) and Z25299seg20
amplicons (SEQ ID NO: 1669) and H61775seg8F (SEQ ID NO: 1634), H61775seg8R (SEQ
ID NO: 1635), HUMGRP5E junc3-7F (SEQ ID NO: 1646), HUMGRP5E junc3-7R (SEQ ID
NO: 1647), M85491Seg24F (SEQ ID NO: 1637), M85491Seg24R (SEQ ID NO: 1638), Z21368
junc17-21F (SEQ ID NO: 1640), Z21368 junc17-21R (SEQ ID NO: 1641), HSSTROL3seg24F
(SEQ ID NO: 1673), HSSTROL3seg24R (SEQ ID NO: 1674), Z25299seg20F (SEQ ID NO:
1667), Z25299seg20R (SEQ ID NO: 1668) primers was measured by real time PCR. In parallel
the expression of four housekeeping genes, PBGD (GenBank Accession No. BC019323;
amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1 (GenBank Accession No. NM_000194;
amplicon - HPRT1-amplicon, SEQ ID NO: 1297), Ubiquitin (GenBank Accession No.
BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession
No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331), was measured similarly. For
each RT sample, the expression of the above amplicons was normalized to the geometric mean
of the quantities of the housekeeping genes. The normalized quantity of each RT sample of each
amplicon was then divided by the median of the quantities of the normal post-mortem (PM)
samples detected for the same amplicon (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue
samples in testing panel", above), to obtain a value of fold up-regulation for each sample relative
to the median of the normal PM samples. The reciprocal of this ratio was calculated for
Z25299seg20 (SEQ ID NO: 1669), to obtain a value of fold down-regulation for each sample
relative to the median of the normal PM samples.

Figures 52-53 are histograms showing differential expression of the above-indicated
transcripts in cancerous lung samples relative to the normal samples. The number and percentage
of samples that exhibit at least 5 fold differential of at least one of the sequences, out of the total
number of samples tested, is indicated at the bottom.

As is evident from Figures 52-53, differential expression of at least 5 fold in at least one
of the sequences was found in 15 out of 15 adenocarcinoma samples, 14 out of 16 squamous cell
carcinoma samples, 4 out of 4 large cell carcinoma samples and in 8 out of 8 small cell
carcinoma samples.
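
The criterion of at least 5 fold differential expression in at least one of the sequences can be sketched as follows. This is a hedged illustration only: the function names and the fold values in the usage note are hypothetical, and the reciprocal is applied only to amplicons treated as down-regulated (Z25299seg20 in the text above).

```python
def fold_differential(fold_up, down_regulated=False):
    """Return fold up-regulation as given, or its reciprocal (fold
    down-regulation) for an amplicon treated as down-regulated."""
    if not down_regulated:
        return fold_up
    # A zero ratio would mean unbounded down-regulation.
    return float("inf") if fold_up == 0 else 1.0 / fold_up


def is_differential(fold_up_by_amplicon, down_regulated_amplicons,
                    threshold=5.0):
    """True if at least one amplicon reaches the fold threshold."""
    return any(
        fold_differential(value, name in down_regulated_amplicons)
        >= threshold
        for name, value in fold_up_by_amplicon.items()
    )
```

For a hypothetical sample with ratios `{"H61775seg8": 1.2, "Z25299seg20": 0.1}`, the second amplicon is 10 fold down-regulated, so the sample counts as differential even though no amplicon is up-regulated 5 fold.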

Statistical analysis was applied to verify the significance of these results, as described
below. A threshold of 5 fold differential expression of at least one of the amplicons was found to
differentiate between cancer and normal samples with a P value of 7.82E-06 in adenocarcinoma,
2.63E-04 in squamous cell carcinoma, 8.24E-03 in large cell adenocarcinoma and 3.57E-04 in
small cell carcinoma, as checked by Fisher's exact test.

The above values demonstrate statistical significance of the results. 
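
For readers unfamiliar with the test, the sketch below computes Fisher's exact test for a 2x2 table from the hypergeometric distribution. It is an illustration only: the text does not state whether a one-sided or two-sided test was used, nor how many normal samples passed the threshold, so the sketch returns both P values and makes no attempt to reproduce the figures quoted above. The function name and table layout are assumptions.

```python
from math import comb


def fisher_exact(a, b, c, d):
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]], e.g.
    rows = cancer / normal, columns = above / below threshold.
    Returns (one_sided_greater, two_sided) P values."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    greater = sum(prob(k) for k in range(a, high + 1))
    # Two-sided: sum every table at most as probable as the observed one.
    two_sided = sum(
        p for p in (prob(k) for k in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return greater, two_sided
```

As a small worked case, the table [[3, 1], [1, 3]] gives a one-sided P value of 17/70 and a two-sided P value of 34/70.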



DESCRIPTION FOR CLUSTER H53626

Cluster H53626 features 2 transcript(s) and 20 segment(s) of interest, the names for which
are given in Tables 1321 and 1322, respectively, the sequences themselves are given at the end
of the application.




Table 1321 - Transcripts of interest

Transcript Name         SEQ ID NO:
H53626_PEA_1_T15        16
H53626_PEA_1_T16        17


Table 1322 - Segments of interest

Segment Name               SEQ ID NO:
H53626_PEA_1_node_15       18
H53626_PEA_1_node_22       19
H53626_PEA_1_node_25       306
H53626_PEA_1_node_26       307
H53626_PEA_1_node_27       308
H53626_PEA_1_node_34       309
H53626_PEA_1_node_35       310
H53626_PEA_1_node_36       311
H53626_PEA_1_node_11       312
H53626_PEA_1_node_12       313
H53626_PEA_1_node_16       314
H53626_PEA_1_node_19       315
H53626_PEA_1_node_20       316
H53626_PEA_1_node_24       317
H53626_PEA_1_node_28       318
H53626_PEA_1_node_29       319
H53626_PEA_1_node_30       320
H53626_PEA_1_node_31       321
H53626_PEA_1_node_32       322
H53626_PEA_1_node_33       323


Table 1323 - Proteins of interest

Protein Name           SEQ ID NO:
H53626_PEA_1_P4        324
H53626_PEA_1_P5        325



Cluster H53626 can be used as a diagnostic marker according to overexpression of
transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given
according to the previously described methods. The term "number" in the right hand column of
the table and the numbers on the y-axis of Figure 76 below refer to weighted expression of ESTs
in each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to
the expression of all ESTs in that category, according to parts per million).

Overall, the following results were obtained as shown with regard to the histograms in
Figure 76 and Table 1324. This cluster is overexpressed (at least at a minimum level) in the
following pathological conditions: epithelial malignant tumors, a mixture of malignant tumors
from different tissues and myosarcoma.



Table 1324 - Normal tissue distribution

Name of Tissue    Number
adrenal           4
bone              233
brain             33
colon             0
epithelial        12
general           17
head and neck     0
kidney            8
lung              25
breast            8
muscle            0
ovary             7
pancreas          10
prostate          8
skin              0
stomach           73
Thyroid           0
uterus            0



Table 1325 - P values and ratios for expression in cancerous tissue

Name of Tissue    P1         P2         SP1        R3     SP2        R4
adrenal           6.4e-01    4.2e-01    2.1e-01    3.1    1.3e-02    4.1
bone              5.8e-01    8.1e-01    9.8e-01    0.3    1.0e+00    0.3
brain             2.2e-01    2.6e-01    8.1e-01    0.8    8.9e-01    0.6
colon             2.3e-01    1.4e-01    1.5e+00    1.2    4.6e-01    1.9
epithelial        8.3e-02    4.8e-03    6.4e-02    1.5    6.6e-08    4.1
general           2.4e-03    1.5e-05    1.1e-03    1.6    2.0e-12    3.1
head and neck     2.1e-01    3.3e-01    0.0e+00    0.0    0.0e+00    0.0
kidney            7.3e-01    5.8e-01    5.8e-01    1.3    5.7e-02    2.0
lung              8.3e-01    5.5e-01    7.9e-01    0.8    3.2e-02    2.1
breast            6.5e-01    2.7e-01    6.9e-01    1.2    7.8e-02    1.9
muscle            1.5e+00    2.9e-01    1.5e+00    1.0    3.5e-03    4.1
ovary             6.7e-01    5.6e-01    1.5e-01    1.7    7.0e-02    2.7
pancreas          2.3e-01    2.0e-01    3.9e-01    1.9    8.2e-02    2.3
prostate          9.0e-01    9.0e-01    6.7e-01    1.1    1.8e-01    1.9
skin              1.5e+00    4.4e-01    1.5e+00    1.0    6.4e-01    1.6
stomach           9.0e-01    3.4e-01    1.0e+00    0.3    6.1e-01    0.9
Thyroid           2.4e-01    2.4e-01    1.5e+00    1.1    1.5e+00    1.1
uterus            2.1e-01    2.4e-01    2.9e-01    2.5    2.6e-01    2.2


As noted above, contig H53626 features 2 transcript(s), which were listed in Table 1321
above. A description of each variant protein according to the present invention is now provided.

Variant protein H53626_PEA_1_P4 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
H53626_PEA_1_T15. The alignment to the wild type protein is given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to the wild type protein is as follows:

Comparison report between H53626_PEA_1_P4 and wild type Q8N441 (SEQ ID 
NO: 1699): 

1. An isolated chimeric polypeptide encoding for H53626_PEA_1_P4, comprising a first
amino acid sequence being at least 90 % homologous to

MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPP 
LTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVWCKATNGFGSLSWYTLV^ 
LDDISPGKESLGPDSSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLKCVAS 

10 GHPRPDITWMKDDQALTRPEAAEPREXKWTLSLKNLRPEDSGKYTCRVSNRAGAINAT 
YKVDVIQRTRSKPVLTGTHPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGR 
HNSTIDVGGQKFWLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFR 
SAFLTVLP corresponding to amino acids 1 - 357 of Q8N441, which also corresponds to amino 
acids 1 - 357 of H53626 PEA1P4, second amino acid sequence being at least 70%, optionally 

15 at least 80%, preferably at least 85%, more preferably at least 90% and most preferably at least 
95% homologous to a polypeptide having the sequence 

GARLPRHATPCWCPDPPPGPGVPPTGWGPTLPSRAVLARSSAEGGQPRGTVSTAPGMG 
LGCSPGLCVGVPLPTSFPLALA corresponding to amino acids 358 - 437 of 
H53626JPEA_1 JP4, and a third amino acid sequence being at least 90 % homologous to 

20 DPKPPGPPVASSSSATSLPWPWIGIPAGAVFILGTLLLWLCQ 

RPPGTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKEYPKLY 
TDIHTHTHTHSHTHSHVEGKVHQHIHYQC corresponding to amino acids 358 - 504 of 
Q8N441, which also corresponds to amino acids 438 - 584 of H53626JPEA_1_P4, wherein said 
first, second and third amino acid sequences are contiguous and in a sequential order. 

25 2.An isolated polypeptide encoding for an edge portion of H53626_PEA_1JP4, 

comprising an amino acid sequence being at least 70%, optionally at least about 80%, preferably 
at least about 85%, more preferably at least about 90% and most preferably at least about 95% 
homologous to the sequence encoding for 

GARLPRHATPCWCPDPPPGPGVPPTGWGPTLPSRAVLARSSAEGGQPRGTVSTAPGMG 
30 LGCSPGLCVGVPLPTSFPLALA, corresponding to H53626JPEA_1_P4. 
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The location of the variant protein was determined according to results from a number of 
different software programs and analyses, including analyses from SignalP and other specialized 
programs. The variant protein is believed to be located as follows with regard to the cell: 
membrane. The protein localization is believed to be membrane because although both signal- 
5 peptide prediction programs agree that this protein has a signal peptide, both trans- membrane 
region prediction programs predict that this protein has a trans -membrane region downstream of 
this signal peptide.. 

Variant protein H53626_PEA_1 JP4 also has the following noivsilent SNPs (Single 
Nucleotide Polymorphisms) as listed in Table 1326, (given according to their position(s) on the 
1 0 amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether 
the SNP is known or not; the presence of known SNPs in variant protein H5 3 62 6_PE A_ 1 _P4 
sequence provides support for the deduced sequence of this variant protein according to the 
present invention). 

Table 1326 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
193                                      R -> L                      Yes
300                                      G ->                        No
319                                      Y -> H                      No
442                                      P -> Q                      Yes
504                                      R -> L                      Yes
521                                      G ->                        No
544                                      P -> L                      Yes
573                                      E -> G                      No

Variant protein H53626_PEA_1_P4 is encoded by the following transcript(s):
H53626_PEA_1_T15, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript H53626_PEA_1_T15 is shown in bold; this coding portion starts at
position 17 and ends at position 1771. The transcript also has the following SNPs as listed in
Table 1327 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein H53626_PEA_1_P4 sequence provides support for the deduced
sequence of this variant protein according to the present invention).



Table 1327 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
76                                    G -> A                     Yes
340                                   G -> T                     No
1647                                  C -> T                     Yes
1734                                  A -> G                     No
1797                                  G ->                       No
1948                                  A -> G                     Yes
2193                                  C -> T                     Yes
2308                                  C -> T                     Yes
2333                                  C -> G                     Yes
2648                                  C -> T                     Yes
2649                                  G -> A                     Yes
2765                                  C -> T                     Yes
594                                   G -> T                     Yes
2972                                  G -> A                     Yes
3027                                  C -> G                     Yes
907                                   T -> C                     Yes
916                                   C ->                       No
971                                   T -> C                     No
1135                                  G -> A                     Yes
1341                                  C -> A                     Yes
1527                                  G -> T                     Yes
1579                                  C ->                       No
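
The amino acid positions of Table 1326 can be cross-checked against the nucleotide positions of Table 1327 using the stated coding portion (positions 17 to 1771 of H53626_PEA_1_T15). The following is a minimal sketch only, assuming 1-based transcript coordinates and assuming that the stated end position includes the stop codon; the function names are illustrative and are not part of the application.

```python
CDS_START = 17    # first nucleotide of the coding portion, as stated in the text
CDS_END = 1771    # last nucleotide of the coding portion, as stated in the text


def protein_length(cds_start: int, cds_end: int) -> int:
    """Amino acids encoded, assuming the coding portion ends with a stop codon."""
    n_nt = cds_end - cds_start + 1
    if n_nt % 3 != 0:
        raise ValueError("coding portion is not a whole number of codons")
    return n_nt // 3 - 1


def codon_index(nt_pos: int, cds_start: int = CDS_START) -> int:
    """1-based amino acid position of the codon that contains nt_pos."""
    return (nt_pos - cds_start) // 3 + 1


# Nucleotide SNP positions from Table 1327 that lie inside the coding portion.
coding_snps = [76, 340, 594, 907, 916, 971, 1135, 1341, 1527, 1579, 1647, 1734]

print("protein length:", protein_length(CDS_START, CDS_END))
for pos in coding_snps:
    print(pos, "-> amino acid", codon_index(pos))
```

Under these assumptions the computed protein length is 584, and nucleotide positions 594, 916, 971, 1341, 1527, 1579, 1647 and 1734 fall in codons 193, 300, 319, 442, 504, 521, 544 and 573, the positions listed in Table 1326. The remaining coding positions do not appear in Table 1326 and are presumably silent.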



Variant protein H53626_PEA_1_P5 according to the present invention has an amino acid
sequence as given at the end of the application; it is encoded by transcript(s)
H53626_PEA_1_T16. The alignment to the wild type protein is given at the end of the
application. A brief description of the relationship of the variant protein according to the present
invention to the wild type protein is as follows:

Comparison report between H53626_PEA_1_P5 and wild type Q9H4D7 (SEQ ID NO: 1700):

1. An isolated chimeric polypeptide encoding for H53626_PEA_1_P5, comprising a first
amino acid sequence being at least 90 % homologous to
MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPP
LTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVV
LDDISPGKESLGPDSSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLKCVAS
GHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAINAT
YKVDVIQRTRSKPVLTGTHPVNTTVDFGGTTSFQCK corresponding to amino acids 1 - 269
of Q9H4D7, which also corresponds to amino acids 1 - 269 of H53626_PEA_1_P5, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence
TQNRQGHLWPPRPRPLACRGPWSSASQPALSSSWAPCSCGFARPRRSRAPPRLPLPCLG
TARRGRPATAAETRTFPRWPPSALALVWGCVRSMGLRQPPSTYWAQAQLLALSCTPNS
TQTSTHTHTHTLTHTHTWRARSTSTSTISARRHRICSGHGGAGQTGRLGGWRTELQTKA
GDPWRGGMASTPGSLCVRHSPWTHTHRHTHYLDACMHTHARTRAP corresponding to
amino acids 270 - 490 of H53626_PEA_1_P5, wherein said first and second amino acid
sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of H53626_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
TQNRQGHLWPPRPRPLACRGPWSSASQPALSSSWAPCSCGFARPRRSRAPPRLPLPCLG
TARRGRPATAAETRTFPRWPPSALALVWGCVRSMGLRQPPSTYWAQAQLLALSCTPNS
TQTSTHTHTHTLTHTHTWRARSTSTSTISARRHRICSGHGGAGQTGRLGGWRTELQTKA
GDPWRGGMASTPGSLCVRHSPWTHTHRHTHYLDACMHTHARTRAP in
H53626_PEA_1_P5.

Comparison report between H53626_PEA_1_P5 and wild type Q8N441:

1. An isolated chimeric polypeptide encoding for H53626_PEA_1_P5, comprising a first
amino acid sequence being at least 90 % homologous to
MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPP
LTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVV
LDDISPGKESLGPDSSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLKCVAS
GHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAINAT
YKVDVIQRTRSKPVLTGTHPVNTTVDFGGTTSFQCK corresponding to amino acids 1 - 269
of Q8N441, which also corresponds to amino acids 1 - 269 of H53626_PEA_1_P5, and a
second amino acid sequence being at least 70%, optionally at least 80%, preferably at least 85%,
more preferably at least 90% and most preferably at least 95% homologous to a polypeptide
having the sequence
TQNRQGHLWPPRPRPLACRGPWSSASQPALSSSWAPCSCGFARPRRSRAPPRLPLPCLG
TARRGRPATAAETRTFPRWPPSALALVWGCVRSMGLRQPPSTYWAQAQLLALSCTPNS
TQTSTHTHTHTLTHTHTWRARSTSTSTISARRHRICSGHGGAGQTGRLGGWRTELQTKA
GDPWRGGMASTPGSLCVRHSPWTHTHRHTHYLDACMHTHARTRAP corresponding to
amino acids 270 - 490 of H53626_PEA_1_P5, wherein said first and second amino acid
sequences are contiguous and in a sequential order.

2. An isolated polypeptide encoding for a tail of H53626_PEA_1_P5, comprising a
polypeptide being at least 70%, optionally at least about 80%, preferably at least about 85%,
more preferably at least about 90% and most preferably at least about 95% homologous to the
sequence
TQNRQGHLWPPRPRPLACRGPWSSASQPALSSSWAPCSCGFARPRRSRAPPRLPLPCLG
TARRGRPATAAETRTFPRWPPSALALVWGCVRSMGLRQPPSTYWAQAQLLALSCTPNS
TQTSTHTHTHTLTHTHTWRARSTSTSTISARRHRICSGHGGAGQTGRLGGWRTELQTKA
GDPWRGGMASTPGSLCVRHSPWTHTHRHTHYLDACMHTHARTRAP in
H53626_PEA_1_P5.



The location of the variant protein was determined according to results from a number of
different software programs and analyses, including analyses from SignalP and other specialized
programs. The variant protein is believed to be located as follows with regard to the cell:
secreted. The protein localization is believed to be secreted because both signal-peptide
prediction programs predict that this protein has a signal peptide, and neither trans-membrane
region prediction program predicts that this protein has a trans-membrane region.

Variant protein H53626_PEA_1_P5 also has the following non-silent SNPs (Single
Nucleotide Polymorphisms) as listed in Table 1328 (given according to their position(s) on the
amino acid sequence, with the alternative amino acid(s) listed; the last column indicates whether
the SNP is known or not; the presence of known SNPs in variant protein H53626_PEA_1_P5
sequence provides support for the deduced sequence of this variant protein according to the
present invention).

Table 1328 - Amino acid mutations

SNP position(s) on amino acid sequence   Alternative amino acid(s)   Previously known SNP?
193                                      R -> L                      Yes
274                                      Q -> K                      Yes
336                                      A -> S                      Yes
353                                      A ->                        No
376                                      Q -> *                      Yes
405                                      R -> G                      No
426                                      G ->                        No
476                                      Y -> C                      Yes

Variant protein H53626_PEA_1_P5 is encoded by the following transcript(s):
H53626_PEA_1_T16, for which the sequence(s) is/are given at the end of the application. The
coding portion of transcript H53626_PEA_1_T16 is shown in bold; this coding portion starts at
position 17 and ends at position 1489. The transcript also has the following SNPs as listed in
Table 1329 (given according to their position on the nucleotide sequence, with the alternative
nucleic acid listed; the last column indicates whether the SNP is known or not; the presence of
known SNPs in variant protein H53626_PEA_1_P5 sequence provides support for the deduced
sequence of this variant protein according to the present invention).

Table 1329 - Nucleic acid SNPs

SNP position on nucleotide sequence   Alternative nucleic acid   Previously known SNP?
76                                    G -> A                     Yes
340                                   G -> T                     No
1688                                  C -> T                     Yes
1803                                  C -> T                     Yes
1828                                  C -> G                     Yes
2143                                  C -> T                     Yes
2144                                  G -> A                     Yes
2260                                  C -> T                     Yes
2467                                  G -> A                     Yes
2522                                  C -> G                     Yes
594                                   G -> T                     Yes
836                                   C -> A                     Yes
1022                                  G -> T                     Yes
1074                                  C ->                       No
1142                                  C -> T                     Yes
1229                                  A -> G                     No
1292                                  G ->                       No
1443                                  A -> G                     Yes



As noted above, cluster H53626 features 20 segment(s), which were listed in Table 2
above and for which the sequence(s) are given at the end of the application. These segment(s)
are portions of nucleic acid sequence(s) which are described herein separately because they are
of particular interest. A description of each segment according to the present invention is now
provided.

Segment cluster H53626_PEA_1_node_15 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1330 below describes the starting and ending position of this segment on each transcript.

Table 1330 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     96                          343
H53626_PEA_1_T16     96                          343

Segment cluster H53626_PEA_1_node_22 according to the present invention is supported
by 42 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1332 below describes the starting and ending position of this segment on each transcript.

Table 1332 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     450                         734
H53626_PEA_1_T16     450                         734



Segment cluster H53626_PEA_1_node_25 according to the present invention is supported
by 41 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15. Table 1334 below describes
the starting and ending position of this segment on each transcript.

Table 1334 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     824                         1088



Segment cluster H53626_PEA_1_node_26 according to the present invention is supported
by 5 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15. Table 1336 below describes
the starting and ending position of this segment on each transcript.

Table 1336 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     1089                        1328



Segment cluster H53626_PEA_1_node_27 according to the present invention is supported
by 106 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1338 below describes the starting and ending position of this segment on each transcript.

Table 1338 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     1329                        2228
H53626_PEA_1_T16     824                         1723
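
The difference between the node_27 coordinates on the two transcripts can be related to the two segments that occur only in H53626_PEA_1_T15 (node_25 and node_26, Tables 1334 and 1336). The sketch below is illustrative only; it assumes inclusive 1-based coordinates, and the helper name is hypothetical.

```python
from typing import Tuple

# Segment coordinates on transcript H53626_PEA_1_T15 (Tables 1334 and 1336).
node_25 = (824, 1088)
node_26 = (1089, 1328)

# Starting position of node_27 on each transcript (Table 1338).
node_27_start = {"H53626_PEA_1_T15": 1329, "H53626_PEA_1_T16": 824}


def segment_length(segment: Tuple[int, int]) -> int:
    """Length in nucleotides of a segment with inclusive 1-based coordinates."""
    start, end = segment
    return end - start + 1


skipped = segment_length(node_25) + segment_length(node_26)
offset = node_27_start["H53626_PEA_1_T15"] - node_27_start["H53626_PEA_1_T16"]

print("nucleotides present only in T15:", skipped)
print("coordinate offset of node_27:", offset)
print("remainder modulo 3:", skipped % 3)
```

The combined length of the two T15-only segments equals the coordinate offset of node_27. Because that length is not a multiple of three, the reading frame downstream of the junction would differ between the two transcripts, which would be consistent with the distinct tail described for H53626_PEA_1_P5; this reading is an inference from the tables rather than a statement of the application.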



Segment cluster H53626_PEA_1_node_34 according to the present invention is supported
by 121 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1340 below describes the starting and ending position of this segment on each transcript.

Table 1340 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2507                        2977
H53626_PEA_1_T16     2002                        2472



Segment cluster H53626_PEA_1_node_35 according to the present invention is supported
by 85 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1342 below describes the starting and ending position of this segment on each transcript.

Table 1342 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2978                        3148
H53626_PEA_1_T16     2473                        2643

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment, shown in Table 1343.

Table 1343 - Oligonucleotides related to this segment

Oligonucleotide name   Overexpressed in cancers   Chip reference
NA







Segment cluster H53626_PEA_1_node_36 according to the present invention is supported
by 69 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1344 below describes the starting and ending position of this segment on each transcript.

Table 1344 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     3149                        3322
H53626_PEA_1_T16     2644                        2817

Microarray (chip) data is also available for this segment as follows. As described above
with regard to the cluster itself, various oligonucleotides were tested for being differentially
expressed in various disease conditions, particularly cancer. The following oligonucleotides
were found to hit this segment, shown in Table 1345.

Table 1345 - Oligonucleotides related to this segment

Oligonucleotide name   Overexpressed in cancers   Chip reference
NA







According to an optional embodiment of the present invention, short segments related to
the above cluster are also provided. These segments are up to about 120 bp in length, and so are
included in a separate description.

Segment cluster H53626_PEA_1_node_11 according to the present invention is supported
by 12 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1346 below describes the starting and ending position of this segment on each transcript.

Table 1346 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     1                           55
H53626_PEA_1_T16     1                           55

Segment cluster H53626_PEA_1_node_12 according to the present invention is supported
by 11 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1347 below describes the starting and ending position of this segment on each transcript.

Table 1347 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     56                          95
H53626_PEA_1_T16     56                          95



Segment cluster H53626_PEA_1_node_16 according to the present invention can be
found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16. Table
1348 below describes the starting and ending position of this segment on each transcript.

Table 1348 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     344                         368
H53626_PEA_1_T16     344                         368



Segment cluster H53626_PEA_1_node_19 according to the present invention is supported
by 25 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1349 below describes the starting and ending position of this segment on each transcript.

Table 1349 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     369                         419
H53626_PEA_1_T16     369                         419

Segment cluster H53626_PEA_1_node_20 according to the present invention is supported
by 27 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1350 below describes the starting and ending position of this segment on each transcript.

Table 1350 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     420                         449
H53626_PEA_1_T16     420                         449



Segment cluster H53626_PEA_1_node_24 according to the present invention is supported
by 34 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1351 below describes the starting and ending position of this segment on each transcript.

Table 1351 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     735                         823
H53626_PEA_1_T16     735                         823

Segment cluster H53626_PEA_1_node_28 according to the present invention is supported
by 66 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1352 below describes the starting and ending position of this segment on each transcript.

Table 1352 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2229                        2306
H53626_PEA_1_T16     1724                        1801



Segment cluster H53626_PEA_1_node_29 according to the present invention is supported
by 73 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1353 below describes the starting and ending position of this segment on each transcript.

Table 1353 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2307                        2396
H53626_PEA_1_T16     1802                        1891

Segment cluster H53626_PEA_1_node_30 according to the present invention is supported
by 71 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1354 below describes the starting and ending position of this segment on each transcript.

Table 1354 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2397                        2442
H53626_PEA_1_T16     1892                        1937



Segment cluster H53626_PEA_1_node_31 according to the present invention is supported
by 67 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1355 below describes the starting and ending position of this segment on each transcript.

Table 1355 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2443                        2469
H53626_PEA_1_T16     1938                        1964

Segment cluster H53626_PEA_1_node_32 according to the present invention is supported
by 65 libraries. The number of libraries was determined as previously described. This segment
can be found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16.
Table 1356 below describes the starting and ending position of this segment on each transcript.

Table 1356 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2470                        2498
H53626_PEA_1_T16     1965                        1993



Segment cluster H53626_PEA_1_node_33 according to the present invention can be
found in the following transcript(s): H53626_PEA_1_T15 and H53626_PEA_1_T16. Table
1357 below describes the starting and ending position of this segment on each transcript.

Table 1357 - Segment location on transcripts

Transcript name      Segment starting position   Segment ending position
H53626_PEA_1_T15     2499                        2506
H53626_PEA_1_T16     1994                        2001



Variant protein alignment to the previously known protein:

Sequence name: /tmp/KlMec2ReKO/eglEUS2AXY : Q8N441

Sequence documentation:

Alignment of: H53626_PEA_1_P4 x Q8N441
Alignment segment 1/1:

Quality: 4882.00
Escore: 0
Matching length: 504
Total length: 584
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 86.30
Total Percent Identity: 86.30
Gaps: 1

Alignment:

  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50

 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100

101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150

151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200

201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250

251 THPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGG 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
251 THPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGG 300

301 QKFVVLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFRS 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
301 QKFVVLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFRS 350

351 AFLTVLPGARLPRHATPCWCPDPPPGPGVPPTGWGPTLPSRAVLARSSAE 400
    |||||||
351 AFLTVLP 357

401 GGQPRGTVSTAPGMGLGCSPGLCVGVPLPTSFPLALADPKPPGPPVASSS 450
                                         |||||||||||||
358                                      DPKPPGPPVASSS 370

451 SATSLPWPVVIGIPAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPP 500
    ||||||||||||||||||||||||||||||||||||||||||||||||||
371 SATSLPWPVVIGIPAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPP 420

501 GTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKL 550
    ||||||||||||||||||||||||||||||||||||||||||||||||||
421 GTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKL 470

551 YPKLYTDIHTHTHTHSHTHSHVEGKVHQHIHYQC 584
    ||||||||||||||||||||||||||||||||||
471 YPKLYTDIHTHTHTHSHTHSHVEGKVHQHIHYQC 504
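
The "Total Percent Identity" reported for this alignment appears to be the matching length divided by the total length of the variant protein. A minimal sketch of that arithmetic follows; this interpretation of the alignment report is an assumption, and the function name is illustrative.

```python
def total_percent_identity(matching_length: int, total_length: int) -> float:
    """Identical aligned residues as a percentage of the full variant length."""
    return 100.0 * matching_length / total_length


# Values reported for H53626_PEA_1_P4 x Q8N441: matching length 504, total length 584.
print(f"{total_percent_identity(504, 584):.2f}")
```

The 80 unmatched residues correspond to the single gap at positions 358 - 437 of the variant, which is the unique edge portion described in the comparison report.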






WO 2006/131783 



PCT/IB2005/004037 



1345 



Sequence name: /tmp/oSUZaRW3WK/oSh3fN5ZtO : Q9H4D7

Sequence documentation:

Alignment of: H53626_PEA_1_P5 x Q9H4D7
Alignment segment 1/1:

Quality: 2644.00
Escore: 0
Matching length: 269
Total length: 269
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50

 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100

101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150

151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200

201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250

251 THPVNTTVDFGGTTSFQCK 269
    |||||||||||||||||||
251 THPVNTTVDFGGTTSFQCK 269



Sequence name: /tmp/oSUZaRW3WK/oSh3fN5ZtO : Q8N441

Sequence documentation:

Alignment of: H53626_PEA_1_P5 x Q8N441
Alignment segment 1/1:

Quality: 2644.00
Escore: 0
Matching length: 269
Total length: 269
Matching Percent Similarity: 100.00
Matching Percent Identity: 100.00
Total Percent Similarity: 100.00
Total Percent Identity: 100.00
Gaps: 0

Alignment:

  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQ 50

 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 CPVEGDPPPLTMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCK 100

101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 ATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFT 150

151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 QPSKMRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPR 200

201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 KKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTG 250

251 THPVNTTVDFGGTTSFQCK 269
    |||||||||||||||||||
251 THPVNTTVDFGGTTSFQCK 269

Expression o/Homo sapiens fibroblast growth factor receptor- like 1 (FGFRL1) H53626 
transcripts, which are detectable by amplicon as depicted in sequence name H53626 junc24- 

27F1R3 in normal and cancerous lung tissues. 

Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) transcripts
detectable by or according to junc24-27, H53626 junc24-27F1R3 amplicon (SEQ ID NO: 1690)
and H53626 junc24-27F1 (SEQ ID NO: 1688) and H53626 junc24-27R3 (SEQ ID NO: 1689) primers
was measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1 (GenBank
Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297), UBC (GenBank Accession
No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No.
NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331), was measured similarly. For each RT
sample, the expression of the above amplicon was normalized to the geometric mean of the
quantities of the housekeeping genes. The normalized quantity of each RT sample was then
divided by the median of the quantities of the normal post-mortem (PM) samples (Sample Nos.
47-50, 90-93, 96-99, Table 2, above), to obtain a value of fold up-regulation for each
sample relative to the median of the normal PM samples.
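
The normalization described above can be sketched as follows. This is a minimal illustration only, not the software used for the experiments: the function name `fold_changes`, the dictionary layout and the sample identifiers are assumptions, and the quantities shown are hypothetical rather than measured values.

```python
from statistics import geometric_mean, median


def fold_changes(target, housekeeping, normal_ids):
    """Return fold up-regulation per sample relative to the normal median.

    target:       dict mapping sample id -> quantity of the amplicon of interest
    housekeeping: dict mapping sample id -> list of housekeeping gene quantities
    normal_ids:   ids of the normal (PM) samples used as the baseline
    """
    # Step 1: normalize each sample to the geometric mean of its housekeeping genes.
    normalized = {
        sample: quantity / geometric_mean(housekeeping[sample])
        for sample, quantity in target.items()
    }
    # Step 2: divide by the median normalized quantity of the normal samples.
    baseline = median(normalized[sample] for sample in normal_ids)
    return {sample: value / baseline for sample, value in normalized.items()}


# Hypothetical quantities (arbitrary units) for three normal samples and one tumor.
example_target = {"N1": 1.0, "N2": 2.0, "N3": 3.0, "T1": 20.0}
example_housekeeping = {s: [2.0, 2.0, 2.0, 2.0] for s in example_target}
example_folds = fold_changes(example_target, example_housekeeping, ["N1", "N2", "N3"])
```

Using the geometric mean rather than the arithmetic mean keeps a single highly expressed housekeeping gene from dominating the normalization factor.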

Figure 74 is a histogram showing over expression of the above-indicated Homo sapiens
fibroblast growth factor receptor-like 1 (FGFRL1) transcripts in cancerous lung samples
relative to the normal samples.

As is evident from Figure 74, the expression of Homo sapiens fibroblast growth factor
receptor-like 1 (FGFRL1) transcripts detectable by the above amplicon(s) was higher in several
cancer samples than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2).
Notably, an over-expression of at least 5 fold was found in 7 out of 15 adenocarcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: H53626 junc24-27F1 forward
primer; and H53626 junc24-27R3 reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: H53626
junc24-27F1R3.

Forward primer (SEQ ID NO: 1688): GTCCTTCCAGTGCAAGACCCA
Reverse primer (SEQ ID NO: 1689): TGGGCCTGGCAAAGCC
Amplicon (SEQ ID NO: 1690):
GTCCTTCCAGTGCAAGACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTC
GGCCACTAGCCTGCCGTGGCCCGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCAT
CCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCA
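
The relationship between a primer pair and its amplicon can be checked with a short script: the amplicon should begin with the forward primer and end with the reverse complement of the reverse primer. The sketch below is illustrative only; the helper names `reverse_complement` and `primers_flank_amplicon` are assumptions and are not part of the described method. The sequences are those listed above for SEQ ID NOs: 1688-1690.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_flank_amplicon(forward, reverse, amplicon):
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


FORWARD = "GTCCTTCCAGTGCAAGACCCA"   # SEQ ID NO: 1688
REVERSE = "TGGGCCTGGCAAAGCC"        # SEQ ID NO: 1689
AMPLICON = (                        # SEQ ID NO: 1690
    "GTCCTTCCAGTGCAAGACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTC"
    "GGCCACTAGCCTGCCGTGGCCCGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCAT"
    "CCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCA"
)

print(primers_flank_amplicon(FORWARD, REVERSE, AMPLICON))
```

The same check can be applied to any other primer pair and amplicon listed in this section, which is a quick way to catch transcription errors in the sequences.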



Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) H53626
transcripts, which are detectable by amplicon as depicted in sequence name H53626 seg25 in
normal and cancerous lung tissues.

Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1)
transcripts detectable by or according to seg25, H53626 seg25 amplicon (SEQ ID NO: 1693)
and H53626 seg25F (SEQ ID NO: 1691) and H53626 seg25R (SEQ ID NO: 1692) primers was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - PBGD-amplicon, SEQ ID NO:334), HPRT1
(GenBank Accession No. NM_000194; amplicon - HPRT1-amplicon, SEQ ID NO:1297), UBC
(GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and
SDHA (GenBank Accession No. NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331),
was measured similarly. For each RT sample, the expression of the above amplicon was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above), to obtain a value
of fold up-regulation for each sample relative to the median of the normal PM samples.

As is evident from Figure 75, the expression of Homo sapiens fibroblast growth factor
receptor-like 1 (FGFRL1) transcripts detectable by the above amplicon(s) was higher in a few
cancer samples than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2).
Notably, an over-expression of at least 5 fold was found in 3 out of 15 adenocarcinoma samples.

Primer pairs are also optionally and preferably encompassed within the present
invention; for example, for the above experiment, the following primer pair was used as a
non-limiting illustrative example only of a suitable primer pair: H53626 seg25F forward primer;
and H53626 seg25R reverse primer.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: H53626 seg25.
Forward primer (SEQ ID NO: 1691): CCGACGGCTCCTACCTCAA
Reverse primer (SEQ ID NO: 1692): GGAAGCTGTAGCCCATGGTGT
Amplicon (SEQ ID NO: 1693):
CCGACGGCTCCTACCTCAATAAGCTGCTCATCACCCGTGCCCGCCAGGACGATGCG
GGCATGTACATCTGCCTTGGCGCCAACACCATGGGCTACAGCTTCC


Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) H53626
transcripts, which are detectable by amplicon as depicted in sequence name H53626 seg25 in
different normal tissues.



Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) transcripts
detectable by or according to H53626 seg25 amplicon (SEQ ID NO: 1693) and H53626 seg25F
(SEQ ID NO: 1691) and H53626 seg25R (SEQ ID NO: 1692) was measured by real time PCR.
In parallel the expression of four housekeeping genes: RPL19 (GenBank Accession No.
NM_000981; RPL19 amplicon, SEQ ID NO: 1630), TATA box (GenBank Accession No.
NM_003194; TATA amplicon, SEQ ID NO: 1633), UBC (GenBank Accession No. BC000449;
amplicon - Ubiquitin-amplicon, SEQ ID NO:328) and SDHA (GenBank Accession No.
NM_004168; amplicon - SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each
RT sample, the expression of the above amplicon was normalized to the geometric mean of the
quantities of the housekeeping genes. The normalized quantity of each RT sample was then
divided by the median of the quantities of the lung samples (Sample Nos. 15-17, Table 3 above),
to obtain a value of relative expression of each sample relative to the median of the lung samples.



Forward primer (SEQ ID NO: 1691): CCGACGGCTCCTACCTCAA
Reverse primer (SEQ ID NO: 1692): GGAAGCTGTAGCCCATGGTGT
Amplicon (SEQ ID NO: 1693):
CCGACGGCTCCTACCTCAATAAGCTGCTCATCACCCGTGCCCGCCAGGACGATGCG
GGCATGTACATCTGCCTTGGCGCCAACACCATGGGCTACAGCTTCC



The results are demonstrated in Figure 77, showing the expression of Homo sapiens
fibroblast growth factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by
amplicon as depicted in sequence name H53626 seg25, in different normal tissues.



Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) H53626
transcripts which are detectable by amplicon as depicted in sequence name H53626
junc24-27F1R3 in different normal tissues



Expression of Homo sapiens fibroblast growth factor receptor-like 1 (FGFRL1) transcripts
detectable by or according to H53626 junc24-27F1R3 amplicon (SEQ ID NO: 1690) and
H53626 junc24-27F1 (SEQ ID NO: 1688) and H53626 junc24-27R3 (SEQ ID NO: 1689) was
measured by real time PCR. In parallel the expression of four housekeeping genes - RPL19
(GenBank Accession No. NM_000981; RPL19 amplicon, SEQ ID NO:1630), TATA box
(GenBank Accession No. NM_003194; TATA amplicon, SEQ ID NO: 1633; primers SEQ ID
NOs 1631 and 1632), UBC (GenBank Accession No. BC000449; amplicon - Ubiquitin-amplicon,
SEQ ID NO:328) and SDHA (GenBank Accession No. NM_004168; amplicon -
SDHA-amplicon, SEQ ID NO:331) was measured similarly. For each RT sample, the
expression of the above amplicon was normalized to the geometric mean of the quantities of the
housekeeping genes. The normalized quantity of each RT sample was then divided by the
median of the quantities of the lung samples (Sample Nos. 15-17, Table 3 above), to obtain a
value of relative expression of each sample relative to the median of the lung samples.


Forward primer (SEQ ID NO: 1688): GTCCTTCCAGTGCAAGACCCA
Reverse primer (SEQ ID NO: 1689): TGGGCCTGGCAAAGCC
Amplicon (SEQ ID NO: 1690):
GTCCTTCCAGTGCAAGACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTC
GGCCACTAGCCTGCCGTGGCCCGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCAT
CCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCA

The results are demonstrated in Figure 78, showing the expression of Homo sapiens
fibroblast growth factor receptor-like 1 (FGFRL1) H53626 transcripts, which are detectable by
amplicon as depicted in sequence name H53626 junc24-27F1R3, in different normal tissues.

Expression of trophinin associated protein (tastin) [T86235] transcripts which are detectable by
amplicon as depicted in SEQ ID NO: 1480 in normal and cancerous lung tissues



Expression of trophinin associated protein (tastin) transcripts detectable by SEQ ID
NO:1480 (e.g., variant no. 23-26, 31, 32, represented by SEQ IDs 1485-1488, 1609, 1610) was
measured by real time PCR. In parallel the expression of four housekeeping genes - PBGD
(GenBank Accession No. BC019323; amplicon - SEQ ID NO:1471), HPRT1 (GenBank
Accession No. NM_000194; amplicon - SEQ ID NO: 1468), Ubiquitin (GenBank Accession No.
BC000449; amplicon - SEQ ID NO: 1474) and SDHA (GenBank Accession No. NM_004168;
amplicon - SEQ ID NO: 1477), was measured similarly. For each RT sample, the expression of
SEQ ID NO: 1480 was normalized to the geometric mean of the quantities of the housekeeping
genes. The normalized quantity of each RT sample was then divided by the median of the
quantities of the normal post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2,
"Tissue samples in testing panel", above), to obtain a value of fold up-regulation for each sample
relative to the median of the normal PM samples.

Figure 54a is a histogram showing over expression of the above-indicated trophinin
associated protein (tastin) transcripts in cancerous lung samples relative to the normal samples.
The number and percentage of samples that exhibit at least 5 fold over-expression, out of the
total number of samples tested, is indicated at the bottom.

As is evident from Figure 54a, the expression of trophinin associated protein (tastin)
transcripts detectable by SEQ ID NO: 1480 in cancer samples was significantly higher than in
the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2, "Tissue samples in
testing panel"). Notably, an over-expression of at least 5 fold was found in 6 out of 15
adenocarcinoma samples, 8 out of 16 squamous cell carcinoma samples, 2 out of 4 large cell
carcinoma samples and in 8 out of 8 small cell carcinoma samples.
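
The per-subtype counts stated above can be converted to percentages as in the following sketch. It is an arithmetic illustration only: the function name `summarize` is an assumption, and the combined row covers only the four subtypes listed in the text, which may differ from the total number of samples shown in the figure.

```python
# Samples with at least 5 fold over-expression of SEQ ID NO: 1480, as stated
# in the text: (samples over threshold, samples tested) per tumor subtype.
OVER_5_FOLD = {
    "adenocarcinoma": (6, 15),
    "squamous cell carcinoma": (8, 16),
    "large cell carcinoma": (2, 4),
    "small cell carcinoma": (8, 8),
}


def summarize(counts):
    """Return {name: (hits, tested, percent)} plus a combined row."""
    rows = {
        name: (hits, tested, 100.0 * hits / tested)
        for name, (hits, tested) in counts.items()
    }
    total_hits = sum(hits for hits, _ in counts.values())
    total_tested = sum(tested for _, tested in counts.values())
    rows["all listed subtypes"] = (
        total_hits,
        total_tested,
        100.0 * total_hits / total_tested,
    )
    return rows


for name, (hits, tested, percent) in summarize(OVER_5_FOLD).items():
    print(f"{name}: {hits}/{tested} ({percent:.1f}%)")
```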

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of trophinin associated protein
(tastin) transcripts detectable by SEQ ID NO: 1480 in lung cancer samples versus the normal
lung samples was determined by T test as 1.61E-04.

A threshold of 5 fold overexpression was found to differentiate between cancer and
normal samples with a P value of 1.49E-02, as checked by Fisher's exact test. The above values
demonstrate statistical significance of the results.
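
A threshold test of this kind can be sketched as a Fisher's exact test on a 2x2 table (cancer versus normal, at or above versus below the threshold). The sketch below is an illustration under stated assumptions, not the analysis actually performed: the text does not say whether a one-sided or two-sided test was used, the function name `fisher_exact_two_sided` is hypothetical, and the counts in the usage line are invented for the example and are not intended to reproduce the P values reported above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table

                    >= threshold   < threshold
        cancer           a              b
        normal           c              d

    The P value is the total probability, under the hypergeometric
    distribution with fixed margins, of all tables that are no more
    probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Probability that x of the col1 "over threshold" samples fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating point ties.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: 9 of 15 tumors and 1 of 12 normals over the threshold.
p_value = fisher_exact_two_sided(9, 6, 1, 11)
```

An exact test is a reasonable choice here because the sample groups are small, so large-sample approximations such as the chi-squared test may be unreliable.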

According to the present invention, trophinin associated protein (tastin) is a non-limiting
example of a marker for diagnosing lung cancer. The trophinin associated protein (tastin)
marker of the present invention can be used alone or in combination, for various uses, including
but not limited to, prognosis, prediction, screening, early diagnosis, therapy selection and
treatment monitoring of lung cancer. Although optionally any method may be used to detect
overexpression and/or differential expression of this marker, preferably a NAT-based
technology is used. Therefore, optionally and preferably, any nucleic acid molecule capable of
selectively hybridizing to trophinin associated protein (tastin) as previously defined is also
encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair:
trophinin associated protein (tastin)-TAA-seg 44-forward primer (SEQ ID NO: 1478):
AGACTCCAACCCACAGCCC; and trophinin associated protein (tastin)-TAA-seg 44-
Reverse primer (SEQ ID NO: 1479): CAGCTCAGCCAACCTTGCA.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: trophinin
associated protein (tastin) amplicon, SEQ ID NO: 1480:
AGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGG
AACAGGGACAGGGGGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTT
GGCTACAGTGCAAGGTTGGCTGAGCTG

According to other preferred embodiments of the present invention, trophinin associated
protein (tastin) or a fragment thereof comprises a biomarker for detecting lung cancer.
Optionally and more preferably, trophinin associated protein (tastin) splice variants, as depicted
in SEQ ID NO: 1485-1488, 1609, 1610 (e.g., variant no. 23-26, 31, 32), or a fragment thereof
comprise a biomarker for detecting lung cancer. Optionally and more preferably, the fragment
of trophinin associated protein (tastin) comprises segment_TAA-44 - SEQ ID NO: 1507. Also
optionally and more preferably, any suitable method may be used for detecting a fragment such
as trophinin associated protein (tastin)_segment_TAA-44 - SEQ ID NO: 1507 for example.
Most preferably, NAT-based technology is used, such as any nucleic acid molecule capable of
specifically hybridizing with the fragment. Optionally and most preferably, a primer pair is
used for obtaining the fragment.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to trophinin associated protein (tastin) as described above, including
but not limited to SEQ ID NOs: 1492-1501, 1612. Any oligopeptide or peptide relating to such
an amino acid sequence or fragment thereof may optionally also (additionally or alternatively)
be used as a biomarker, including but not limited to the unique amino acid sequences of these
proteins that are depicted in SEQ ID NOs: 1508-1511, 1613. The present invention also
optionally encompasses antibodies capable of recognizing, and/or being elicited by, such
oligopeptides or peptides.

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
trophinin associated protein (tastin) as described above, optionally for any application.



Expression of trophinin associated protein (tastin) [T86235] transcripts which are
detectable by oligonucleotides as depicted in SEQ ID NOs:1512-1514 in normal and
cancerous lung tissues



Expression of trophinin associated protein (tastin) [T86235] transcripts detectable by
oligonucleotides SEQ ID NOs: 1512-1514 (e.g., variants no. 8-10, 22, 23, 26, 27, 29-31, 33 -
represented by SEQ IDs 1481-1485, 1488-1491, 1609, 1611) was measured with
oligonucleotide-based micro-arrays. The segments detected by the above oligonucleotides as
depicted in SEQ ID NOs: 1512-1514 are for example nucleotide sequences as depicted in SEQ
IDs 1503, 1504, 1506.

The results of image intensities for each feature were normalized according to the
ninetieth percentile of the image intensities of all the features on the chip. Then, feature image
intensities for replicates of the same oligonucleotide on the chip and replicates of the same
sample were averaged. Outlying results were discarded.



For every oligonucleotide (SEQ ID NOs: 1512-1514) the averaged intensity determined for
every sample was divided by the averaged intensity of all the normal samples (Sample Nos.
48, 50, 90-92, 96-99, Table 2, "Tissue samples in testing panel", above), to obtain a value of fold
up-regulation for each sample relative to the averaged normal samples. These data are presented
in a histogram in Figure 54b. As is evident from Figure 54b, the expression of trophinin
associated protein (tastin) [T86235] transcripts detectable with oligonucleotides according to
SEQ ID NOs: 1512-1514 in cancer samples was significantly higher than in the normal samples.
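
The micro-array processing steps described above (percentile normalization, replicate averaging, and division by the averaged normal signal) can be sketched as follows. This is a minimal sketch under assumptions: the function names and data layout are hypothetical, the inclusive percentile definition is one of several in common use and the text does not say which was applied, and the rule for discarding outlying results is not specified in the text, so it is omitted here.

```python
from statistics import mean, quantiles


def normalize_chip(intensities):
    """Scale every feature on one chip by that chip's 90th percentile.

    intensities: dict mapping feature id -> raw image intensity.
    """
    # quantiles(..., n=10) returns the nine deciles; the last is the 90th percentile.
    p90 = quantiles(intensities.values(), n=10, method="inclusive")[-1]
    return {feature: value / p90 for feature, value in intensities.items()}


def averaged_signal(chips, oligo_features):
    """Average one oligonucleotide's normalized intensity over its replicate
    features on each chip and over the replicate chips of one sample."""
    values = []
    for chip in chips:
        normalized = normalize_chip(chip)
        values.extend(normalized[feature] for feature in oligo_features)
    return mean(values)


def fold_vs_normals(signal_by_sample, normal_ids):
    """Divide each sample's averaged signal by the mean signal of the normals."""
    baseline = mean(signal_by_sample[sample] for sample in normal_ids)
    return {sample: value / baseline for sample, value in signal_by_sample.items()}


# Hypothetical chip with eleven features whose intensities are 1.0 to 11.0.
example_chip = {f"f{i}": float(i) for i in range(1, 12)}
example_signal = averaged_signal([example_chip, example_chip], ["f5", "f10"])
```

Scaling by a high percentile rather than the maximum makes the normalization less sensitive to a few saturated features on a chip.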
According to the present invention, trophinin associated protein (tastin) is a non-limiting
example of a marker for diagnosing lung cancer. Although optionally any method may be used
to detect overexpression and/or differential expression of this marker, preferably a NAT-based
technology is used. Therefore, optionally and preferably, any nucleic acid molecule
capable of selectively hybridizing to trophinin associated protein (tastin) as previously defined is
also encompassed within the present invention. Oligonucleotides are also optionally and
preferably encompassed within the present invention; for example, for the above experiment, the
following oligonucleotides were used as a non-limiting illustrative example only of suitable
oligonucleotides: SEQ ID NOs: 1512-1514
SEQ ID 1512:
CATGGTAACACGGCCTCCATGGCTGAGTAGGGGACTAGGAAGGGTAAAAG
SEQ ID 1513:
TGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGG
SEQ ID 1514:
TGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGC

According to other preferred embodiments of the present invention, trophinin associated
protein (tastin) or a fragment thereof comprises a biomarker for detecting lung cancer.
Optionally and more preferably, trophinin associated protein (tastin) splice variants, as depicted
in SEQ ID NO: 1481-1485, 1488-1491, 1609, 1611 (e.g., variant no. 8-10, 22, 23, 26, 27, 29-31,
33), or a fragment thereof comprise a biomarker for detecting lung cancer. Optionally and
more preferably, the fragment of trophinin associated protein (tastin) comprises segment_TAA-14,
35 and 42 - SEQ ID NOs: 1503, 1504, 1506. Also optionally and more preferably, any
suitable method may be used for detecting a fragment such as trophinin associated protein
(tastin)_segment_TAA-14, 35 and 42 - SEQ ID NOs 1503, 1504 and 1506 for example. Most
preferably, NAT-based technology is used, such as any nucleic acid molecule capable of
specifically hybridizing with the fragment. Optionally and most preferably, a primer pair is
used for obtaining the fragment.

According to other preferred embodiments of the present invention, trophinin associated
protein (tastin) splice variants containing the unique segments as depicted in SEQ ID NOs: 1502
and 1505, for example those included in variants 9 and 29 (SEQ ID NOs: 1482 and 1490,
respectively), are useful as biomarkers for detecting lung cancer.

The present invention also optionally and preferably encompasses any nucleic acid 
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to 
trophinin associated protein (tastin) as described above, optionally for any application. 


Expression of Homeo box C10 (HOXC10) [N31842] transcripts which are detectable by
amplicon as depicted in SEQ ID NO: 1517 in normal and cancerous lung tissues

Expression of Homeo box C10 (HOXC10) transcripts detectable by SEQ ID NO: 1517
(e.g., variant no. 3, represented by SEQ ID 1519) was measured by real time PCR. In parallel
the expression of four housekeeping genes - PBGD (GenBank Accession No. BC019323;
amplicon - SEQ ID NO:1471), HPRT1 (GenBank Accession No. NM_000194; amplicon - SEQ
ID NO:3), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ ID NO:9) and SDHA
(GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477), was measured similarly.
For each RT sample, the expression of SEQ ID NO:1517 was normalized to the geometric mean
of the quantities of the housekeeping genes. The normalized quantity of each RT sample was
then divided by the median of the quantities of the normal post-mortem (PM) samples (Sample
Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel", above), to obtain a value of
fold up-regulation for each sample relative to the median of the normal PM samples.

Figure 55 is a histogram showing over expression of the above-indicated Homeo box
C10 (HOXC10) transcripts in cancerous lung samples relative to the normal samples. The
number and percentage of samples that exhibit at least 20 fold over-expression, out of the total
number of samples tested, is indicated at the bottom.

As is evident from Figure 55, the expression of Homeo box C10 (HOXC10) transcripts
detectable by SEQ ID NO: 1517 in cancer samples was significantly higher than in the
non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel"). Notably, an over-expression of at least 20 fold was found in 6 out of 15
adenocarcinoma samples, 9 out of 16 squamous cell carcinoma samples, and in 3 out of 4 large
cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Homeo box C10 (HOXC10) transcripts
detectable by SEQ ID NO: 1517 in lung cancer samples versus the normal lung samples was
determined by T test as 4.43E-03.

A threshold of 20 fold overexpression was found to differentiate between cancer and normal
samples with a P value of 2.88E-02, as checked by Fisher's exact test. The above values
demonstrate statistical significance of the results.

According to the present invention, Homeo box C10 (HOXC10) is a non-limiting
example of a marker for diagnosing lung cancer. The Homeo box C10 (HOXC10) marker of
the present invention can be used alone or in combination, for various uses, including but not
limited to, prognosis, prediction, screening, early diagnosis, therapy selection and treatment
monitoring of lung cancer. Although optionally any method may be used to detect
overexpression and/or differential expression of this marker, preferably a NAT-based
technology is used. Therefore, optionally and preferably, any nucleic acid molecule capable of
selectively hybridizing to Homeo box C10 (HOXC10) as previously defined is also
encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair:
Homeo box C10 (HOXC10)-forward primer (SEQ ID NO: 1515):
GCGAAACGCGATTTGTTGTT; and Homeo box C10 (HOXC10)-Reverse primer (SEQ ID
NO:1516): CATCTGGAGGAGGGAGGGA.



The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Homeo box C10
(HOXC10) amplicon (SEQ ID NO:1517):
GCGAAACGCGATTTGTTGTTTGTGGGTCTGATTTGTGCGTGCGGCTTGGGCTCCTGC
GGCTTTTGGCTCGGCCGGGGGCCTTGGGCAGCGAGGCTGGAGCCGGAAGAGGTGG
AGGTGAAGGGCTGCCCGCCACGTCCCTCCCTCCTCCAGATG.

According to other preferred embodiments of the present invention, Homeo box C10
(HOXC10) or a fragment thereof comprises a biomarker for detecting lung cancer. Optionally
and more preferably, Homeo box C10 (HOXC10) splice variants, as depicted in SEQ ID NO:54
(e.g., variant no. 3), or a fragment thereof comprise a biomarker for detecting lung cancer.
Optionally and more preferably, the fragment of Homeo box C10 (HOXC10) comprises
segment_TAA-seg 6 (SEQ ID NO: 1526). Also optionally and more preferably, any suitable
method may be used for detecting a fragment such as Homeo box C10 (HOXC10)_segment_
TAA-seg 6 (SEQ ID NO: 1526) for example. Most preferably, NAT-based technology is used,
such as any nucleic acid molecule capable of specifically hybridizing with the fragment.
Optionally and most preferably, a primer pair is used for obtaining the fragment.

According to other preferred embodiments of the present invention, Homeo box C10
(HOXC10) splice variants containing the unique segments as depicted in SEQ ID NOs: 1524
and 1525, for example transcripts as depicted in SEQ ID NO: 1515, 1519 and 1520, comprise a
biomarker for detecting lung cancer.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to trophinin associated protein (tastin) as described above, including
but not limited to SEQ ID NOs: 1521 and 1522. Any oligopeptide or peptide relating to such an
amino acid sequence or fragment thereof may optionally also (additionally or alternatively) be
used as a biomarker, including but not limited to the unique amino acid sequence of the protein
SEQ ID NO: 1522, as depicted in SEQ ID NO: 1523. The present invention also optionally
encompasses antibodies capable of recognizing, and/or being elicited by, such oligopeptides or
peptides.

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
trophinin associated protein (tastin) as described above, optionally for any application.



Expression of Nucleolar protein 4 (NOL4) [T06014] transcripts which are detectable by
amplicon as depicted in SEQ ID NO: 1529 in normal and cancerous lung tissues

Expression of Nucleolar protein 4 (NOL4) transcripts detectable by SEQ ID NO: 1529
(e.g., variant no. 3, 11 and 12, represented by SEQ IDs 1533, 1537, 1538) was measured by real
time PCR. In parallel the expression of four housekeeping genes - PBGD (GenBank Accession
No. BC019323; amplicon - SEQ ID NO:1471), HPRT1 (GenBank Accession No. NM_000194;
amplicon - SEQ ID NO: 1468), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ
ID NO: 1474) and SDHA (GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477),
was measured similarly. For each RT sample, the expression of SEQ ID NO: 1529 was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, above, "Tissue samples in
testing panel"), to obtain a value of fold up-regulation for each sample relative to the median of
the normal PM samples.

Figures 56a and 56b are histograms showing over expression of the above-indicated
Nucleolar protein 4 (NOL4) transcripts in cancerous lung samples relative to the normal
samples. The number and percentage of samples that exhibit at least 200 fold or 6 fold
over-expression, out of the total number of samples tested, is indicated at the bottom of
Figures 56a and 56b, respectively.

As is evident from Figure 56a, the expression of Nucleolar protein 4 (NOL4) transcripts
detectable by SEQ ID NO: 1529 in the samples originating from small cell carcinoma of the lung
was significantly higher than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99,
Table 2, "Tissue samples in testing panel"). Notably, an over-expression of at least 200 fold was
found in 8 out of 8 small cell carcinoma samples. As is evident from Figure 56b, over
expression of at least 6 fold was also observed in 2 out of 15 adenocarcinoma samples and 3 out
of 16 squamous cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Nucleolar protein 4 (NOL4)
transcripts detectable by SEQ ID NO: 1529 in lung cancer samples versus the normal lung
samples was determined by T test as 1.36E-02.

A threshold of 6 fold overexpression was found to differentiate between cancer and
normal samples with a P value of 2.52E-02, as checked by Fisher's exact test.

The P value for the difference in the expression levels of Nucleolar protein 4 (NOL4)
transcripts detectable by SEQ ID NO: 1529 in lung small cell carcinoma samples versus the
normal lung samples was determined by T test as 3.86E-03.

A threshold of 200 fold overexpression was found to differentiate between small cell carcinoma
and normal lung samples with a P value of 7.94E-06, as checked by Fisher's exact test.
The above values demonstrate statistical significance of the results.

According to the present invention, Nucleolar protein 4 (NOL4) is a non-limiting
example of a marker for diagnosing lung cancer. The Nucleolar protein 4 (NOL4) marker of the
present invention can be used alone or in combination, for various uses, including but not
limited to, prognosis, prediction, screening, early diagnosis, therapy selection and treatment
monitoring of lung cancer. Although optionally any method may be used to detect
overexpression and/or differential expression of this marker, preferably a NAT-based
technology is used. Therefore, optionally and preferably, any nucleic acid molecule capable of
selectively hybridizing to Nucleolar protein 4 (NOL4) as previously defined is also
encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair:
Nucleolar protein 4 (NOL4)-TAA-seg1-forward primer (SEQ ID NO: 1527):
CTCGCTCCCTTGCTCACAC; and Nucleolar protein 4 (NOL4)-TAA-seg1-Reverse primer
(SEQ ID NO:1528): AAAGGGAAAGCGGGATGTTT.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Nucleolar
protein 4 (NOL4) amplicon (SEQ ID NO: 1529):
CTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACT
GACCATTTTGCAAGTGTCAGGACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCT
TTCCCTTT.

According to other preferred embodiments of the present invention, Nucleolar protein 4
(NOL4) or a fragment thereof comprises a biomarker for detecting lung cancer. Optionally and
more preferably, Nucleolar protein 4 (NOL4) splice variants, as depicted in SEQ ID NO: 1529
(e.g., variants nos. 3, 11 and 12), or a fragment thereof comprise a biomarker for detecting lung
cancer. Optionally and more preferably, the fragment of Nucleolar protein 4 (NOL4) comprises
segment_TAA-seg-1 (SEQ ID NO: 1552). Also optionally and more preferably, any suitable
method may be used for detecting a fragment such as Nucleolar protein 4 (NOL4)_segment_
TAA-seg-1 (SEQ ID NO: 1552) for example. Most preferably, NAT-based technology is used,
such as any nucleic acid molecule capable of specifically hybridizing with the fragment.
Optionally and most preferably, a primer pair is used for obtaining the fragment.

According to other preferred embodiments of the present invention, Nucleolar protein 4
(NOL4) splice variants containing the unique segments as depicted in SEQ ID NOs: 1554 and
1555, for example transcripts as depicted in SEQ ID NOs: 1534-1536 and 1539-1541,
comprise a biomarker for detecting lung cancer.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to Nucleolar protein 4 (NOL4) as described above, including but not
limited to SEQ ID NOs: 1542, 1547 and 1543; 1548, 1545, 1546, and 1549-1551. Any
oligopeptide or peptide relating to such an amino acid sequence or fragment thereof may
optionally also (additionally or alternatively) be used as a biomarker, including but not limited
to the unique amino acid sequence of the protein SEQ ID NO: 1543, 1546, 1549 as depicted in
SEQ ID NO: 1544.

The present invention also optionally encompasses antibodies capable of recognizing, 
and/or being elicited by, such oligopeptides or peptides. 

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
Nucleolar protein 4 (NOL4) as described above, optionally for any application.



Expression of Nucleolar protein 4 (NOL4)-[T06014] transcripts which are detectable by
amplicon as depicted in SEQ ID NO: 1532 in normal and cancerous lung tissues

Expression of Nucleolar protein 4 (NOL4) transcripts detectable by SEQ ID NO: 1532
(e.g., variant nos. 3, 11 and 12, represented by SEQ ID NOs: 1533, 1537, 1538) was measured by
real time PCR. In parallel the expression of four housekeeping genes - PBGD (GenBank Accession
No. BC019323; amplicon - SEQ ID NO: 1471), HPRT1 (GenBank Accession No. NM_000194;
amplicon - SEQ ID NO: 1468), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ
ID NO: 1474) and SDHA (GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1481),
was measured similarly. For each RT sample, the expression of SEQ ID NO: 1532 was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel", above), to obtain a value of fold up-regulation for each sample relative to the median of
the normal PM samples.
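
The normalization described above can be sketched in a few lines. This is an illustrative sketch only: the sample labels and quantities are hypothetical placeholders rather than measured data, and the function names are assumptions introduced for the example.

```python
from statistics import geometric_mean, median


def normalize(target_qty, housekeeping_qtys):
    """Divide the target quantity by the geometric mean of the
    housekeeping-gene quantities measured in the same RT sample."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_upregulation(normalized, normal_ids):
    """Divide every normalized quantity by the median of the normalized
    quantities of the normal (post-mortem) samples."""
    baseline = median(normalized[s] for s in normal_ids)
    return {sample: qty / baseline for sample, qty in normalized.items()}


# Hypothetical quantities: (target, (PBGD, HPRT1, Ubiquitin, SDHA)).
raw = {
    "N1": (4.0, (2.0, 8.0, 4.0, 4.0)),
    "N2": (8.0, (2.0, 8.0, 4.0, 4.0)),
    "N3": (12.0, (2.0, 8.0, 4.0, 4.0)),
    "T1": (80.0, (2.0, 8.0, 4.0, 4.0)),
}
normalized = {s: normalize(t, hk) for s, (t, hk) in raw.items()}
folds = fold_upregulation(normalized, normal_ids=["N1", "N2", "N3"])

if __name__ == "__main__":
    for sample, fold in folds.items():
        print(sample, round(fold, 2))
```

Using the geometric mean rather than the arithmetic mean keeps a single highly expressed housekeeping gene from dominating the normalization factor.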

Figures 57a and b are histograms showing over-expression of the above-indicated
Nucleolar protein 4 (NOL4) transcripts in cancerous lung samples relative to the normal
samples. The number and percentage of samples that exhibit at least 400 fold or 6 fold
over-expression, out of the total number of samples tested, are indicated at the bottom of
Figures 57a and b, respectively.

As is evident from Figure 57a, the expression of Nucleolar protein 4 (NOL4) transcripts
detectable by SEQ ID NO: 1532 in the samples originating from small cell carcinoma of the lung
was significantly higher than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99,
Table 2, "Tissue samples in testing panel"). Notably an over-expression of at least 400 fold was
found in 8 out of 8 small cell carcinoma samples. As is evident from Figure 4b, over-expression
of at least 6 fold was observed also in 4 out of 15 adenocarcinoma samples and 3 out of 16
squamous cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Nucleolar protein 4 (NOL4)
transcripts detectable by SEQ ID NO: 1532 in lung cancer samples versus the normal lung
samples was determined by T test as 1.70E-02.

A threshold of 6 fold overexpression was found to differentiate between cancer and
normal samples with a P value of 1.80E-02 as checked by the Fisher exact test.

The P value for the difference in the expression levels of Nucleolar protein 4 (NOL4)
transcripts detectable by SEQ ID NO: 1532 in lung small cell carcinoma samples versus the
normal lung samples was determined by T test as 7.08E-03.

A threshold of 400 fold overexpression was found to differentiate between small cell carcinoma
and normal lung samples with a P value of 1.03E-04 as checked by the Fisher exact test.

The above values demonstrate statistical significance of the results. 
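
The threshold comparison above rests on a 2x2 table: cancer versus normal samples, at or above versus below the fold threshold. The following is a minimal sketch of a one-sided Fisher exact test on such a table, written with the standard library only. The text does not state whether a one-sided or two-sided test was used, and the counts in the example are hypothetical; the sketch is not intended to reproduce the P values reported above.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P value for the 2x2 table

                     >= threshold   < threshold
        cancer            a              b
        normal            c              d

    i.e. the probability, with all margins fixed, of seeing `a` or more
    cancer samples at or above the threshold."""
    row1 = a + b          # number of cancer samples
    col1 = a + c          # number of samples at or above the threshold
    n = a + b + c + d     # total number of samples
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts: 7 of 8 cancer samples and 1 of 12 normal
    # samples are at or above the threshold.
    print(fisher_exact_greater(7, 1, 1, 11))
```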

According to the present invention, Nucleolar protein 4 (NOL4) is a non-limiting
example of a marker for diagnosing lung cancer. The Nucleolar protein 4 (NOL4) marker of the
present invention can be used alone or in combination, for various uses, including but not
limited to, prognosis, prediction, screening, early diagnosis, therapy selection and treatment
monitoring of lung cancer. Although optionally any method may be used to detect
overexpression and/or differential expression of this marker, preferably a NAT-based
technology is used. Therefore, optionally and preferably, any nucleic acid molecule capable of
selectively hybridizing to Nucleolar protein 4 (NOL4) as previously defined is also
encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair:
Nucleolar protein 4 (NOL4)-TAA-seg 3-forward primer (SEQ ID NO: 1530):
ACATCCCCCTGGAACGGAT; and Nucleolar protein 4 (NOL4)-TAA-seg 3-Reverse primer
(SEQ ID NO: 1531): CAGAAATTAGCAAAGCATTGATGG.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Nucleolar
protein 4 (NOL4) amplicon (SEQ ID NO: 1532):

ACATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTAT
GGCCAAATCTCCATCAATGCTTTGCTAATTTCTG.

According to other preferred embodiments of the present invention, Nucleolar protein 4
(NOL4) or a fragment thereof comprises a biomarker for detecting lung cancer. Optionally and
more preferably, Nucleolar protein 4 (NOL4) splice variants, as depicted in SEQ ID NO: 1533,
1537, 1538 (e.g., variants nos. 3, 11, 12), or a fragment thereof comprise a biomarker for
detecting lung cancer. Optionally and more preferably, the fragment of Nucleolar protein 4
(NOL4) comprises segment_TAA-seg-3 (SEQ ID NO: 1553). Also optionally and more
preferably, any suitable method may be used for detecting a fragment such as Nucleolar protein
4 (NOL4)_segment_TAA-seg-3 (SEQ ID NO: 1553) for example. Most preferably, NAT-based
technology is used, such as any nucleic acid molecule capable of specifically hybridizing
with the fragment. Optionally and most preferably, a primer pair is used for obtaining the
fragment.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to Nucleolar protein 4 (NOL4) as described above, including but not
limited to SEQ ID NOs: 1542, 1547 and 1548. Any oligopeptide or peptide
relating to such an amino acid sequence or fragment thereof may optionally also (additionally or
alternatively) be used as a biomarker.

The present invention also optionally encompasses antibodies capable of recognizing,
and/or being elicited by, such oligopeptides or peptides.

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
Nucleolar protein 4 (NOL4) as described above, optionally for any application.



Expression of AA281370 transcripts which are detectable by amplicon as depicted in SEQ ID
NO: 1558 in normal and cancerous lung tissues

The AA281370 gene was identified by a computational process described above as
over-expressed in lung cancer. The AA281370 encoded proteins (SEQ ID NO: 1563, 1564) contain
several WD40 domains, which are found in a number of eukaryotic proteins that cover a wide
variety of functions, including adaptor/regulatory modules in signal transduction, pre-mRNA
processing and cytoskeleton assembly. As is demonstrated in Figure 63, the WD40 domain
region of the AA281370 encoded protein, depicted in SEQ ID NO: 1564, has several similarities
that might suggest involvement in the MAPK signal transduction pathway. For example, the region
of the AA281370 polypeptide SEQ ID NO: 1564 located between amino acids at positions 40-790
has 75% homology to the WD40 domain region of mouse Mapkbp1 protein (gi|47124622)
(Figure 63a); and the amino acids at positions 40-886 of the AA281370 polypeptide SEQ ID
NO: 1564 have 70% homology to rat JNK-binding protein JNKBP1 (gi|34856717) (Figure 63b).
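
A percentage of this kind is commonly derived from a pairwise alignment. The text does not state how the homology values were computed, so the following is only a sketch under one assumption: that the percentage is the share of aligned columns carrying identical residues. The two aligned strings are hypothetical and are not taken from SEQ ID NO: 1564 or from the mouse or rat proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns with identical residues.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments.
    print(percent_identity("MKT-LLVA", "MKTALIVA"))
```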

Expression of AA281370 transcripts detectable by SEQ ID NO: 1558 (e.g., variant nos. 0,
1, 4 and 5, represented in SEQ ID NOs: 1559-1562) was measured by real time PCR. In parallel the
expression of four housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon
- SEQ ID NO: 1471), HPRT1 (GenBank Accession No. NM_000194; amplicon - SEQ ID
NO: 1468), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ ID NO: 1474) and
SDHA (GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477), was measured
similarly. For each RT sample, the expression of SEQ ID NO: 1558 was normalized to the
geometric mean of the quantities of the housekeeping genes. The normalized quantity of each
RT sample was then divided by the median of the quantities of the normal post-mortem (PM)
samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel", above),
to obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.

Figure 58 is a histogram showing over-expression of the above-indicated AA281370
transcripts in cancerous lung samples relative to the normal samples. The number and
percentage of samples that exhibit at least 6 fold over-expression, out of the total number of
samples tested, are indicated at the bottom.

As is evident from Figure 58, the expression of AA281370 transcripts detectable by
SEQ ID NO: 1558 in cancer samples was significantly higher than in the non-cancerous samples
(Sample Nos. 46-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel"). Notably an
over-expression of at least 6 fold was found in 8 out of 8 small cell carcinoma samples, in 2 out
of 16 squamous cell carcinoma samples, and in 1 out of 4 large cell carcinoma samples.
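
Per-histology counts of this kind ("x out of y samples at or above the threshold") can be tallied directly from the fold up-regulation values. The sketch below is illustrative only; the sample labels, histology assignments and fold values are hypothetical and do not correspond to the samples of Table 2.

```python
from collections import defaultdict


def count_overexpressed(fold_by_sample, histology, threshold):
    """For each histology group return (hits, total, percentage), where a
    hit is a sample whose fold up-regulation is at least `threshold`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hits, total]
    for sample, fold in fold_by_sample.items():
        group = histology[sample]
        counts[group][1] += 1
        if fold >= threshold:
            counts[group][0] += 1
    return {
        group: (hits, total, 100.0 * hits / total)
        for group, (hits, total) in counts.items()
    }


# Hypothetical fold values and histology labels.
folds = {"S1": 12.0, "S2": 7.5, "S3": 1.2, "S4": 0.8, "S5": 6.0}
histology = {
    "S1": "small cell",
    "S2": "small cell",
    "S3": "squamous",
    "S4": "squamous",
    "S5": "large cell",
}
summary = count_overexpressed(folds, histology, threshold=6.0)

if __name__ == "__main__":
    for group, (hits, total, pct) in summary.items():
        print(f"{group}: {hits} out of {total} ({pct:.0f}%)")
```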

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of AA281370 transcripts
detectable by SEQ ID NO: 1558 in lung cancer samples versus the normal lung samples was
determined by T test as 8.58E-07.

Threshold of 6 fold overexpression was found to differentiate between cancer and
normal samples with P value of 4.81E-02 as checked by exact fisher test.

The above values demonstrate statistical significance of the results. 
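
The text reports a "T test" without naming the variant. As an illustration only, the sketch below computes the statistic and degrees of freedom of Welch's unequal-variance t-test, which is an assumption and not necessarily the test that was applied. It stops short of a P value, because that requires the Student t distribution, which is not available in the Python standard library. The input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test.

    t  = (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b)
    df = Welch-Satterthwaite approximation of the degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    va = variance(sample_a) / n_a  # squared standard error of mean_a
    vb = variance(sample_b) / n_b  # squared standard error of mean_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical fold values for cancer and normal samples.
    cancer = [2.0, 4.0, 6.0]
    normal = [1.0, 2.0, 3.0]
    print(welch_t(cancer, normal))
```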

According to the present invention, AA281370 transcripts are a non-limiting example of
a marker for diagnosing lung cancer. The AA281370 marker of the present invention can be
used alone or in combination, for various uses, including but not limited to, prognosis,
prediction, screening, early diagnosis, therapy selection and treatment monitoring of lung
cancer. Although optionally any method may be used to detect overexpression and/or
differential expression of this marker, preferably a NAT-based technology is used. Therefore,
optionally and preferably, any nucleic acid molecule capable of selectively hybridizing to
AA281370 as previously defined is also encompassed within the present invention. Primer pairs
are also optionally and preferably encompassed within the present invention; for example, for
the above experiment, the following primer pair was used as a non-limiting illustrative example
only of a suitable primer pair: AA281370-forward primer (SEQ ID NO: 1556):
GGTTCGGATGGACTACACTTTGTC; and AA281370-Reverse primer (SEQ ID NO: 1557):
CCACGTACTTCTGGGTGATGTC.

The present invention also preferably encompasses any amplicon obtained through the 
use of any suitable primer pair; for example, for the above experiment, the following amplicon 
was obtained as a non-limiting illustrative example only of a suitable amplicon: AA281370-
amplicon (SEQ ID NO: 1558):

GGTTCGGATGGACTACACTTTGTCCGTACCCACCACGTAGCAGAGAAAACCACCTT 
GTATGACATGGACATTGACATCACCCAGAAGTACGTGG. 

According to other preferred embodiments of the present invention, AA281370 or a
fragment thereof comprises a biomarker for detecting lung cancer. Optionally and more
preferably, AA281370 splice variants, as depicted in SEQ ID NO: 1558 (e.g., variants no: 0, 1, 4
and 5), or a fragment thereof comprise a biomarker for detecting lung cancer. Optionally and
more preferably, the fragment of AA281370 comprises segment_TAA seg 10 SEQ ID NO:
1567. Also optionally and more preferably, any suitable method may be used for detecting a
fragment such as AA281370_segment_TAA seg 10 SEQ ID NO: 1567 for example. Most
preferably, NAT-based technology is used, such as any nucleic acid molecule capable of
specifically hybridizing with the fragment. Optionally and most preferably, a primer pair is
used for obtaining the fragment.

According to other preferred embodiments, the present invention also optionally and 
preferably encompasses AA281370 splice variants containing the unique segments as depicted 
in SEQ ID NO: 1568, for example transcripts 4 and 5, as depicted in SEQ ID NOs: 1561 and 
1562, as a biomarker for detecting lung cancer.

According to still other preferred embodiments, the present invention optionally and 
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid 
sequence corresponding to AA281370 as described above, including but not limited to SEQ ID 
NOs: 1563-1566. Any oligopeptide or peptide relating to such an amino acid sequence or
fragment thereof may optionally also (additionally or alternatively) be used as a biomarker,
including but not limited to the unique amino acid sequence of the proteins SEQ ID NOs: 1563-
1566, as depicted in SEQ ID NOs: 1569, 1570 and 1571.

The present invention also optionally encompasses antibodies capable of recognizing,
and/or being elicited by, such oligopeptides or peptides.

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
AA281370 as described above, optionally for any application.



Expression of Sulfatase 1 (SULF1)-[Z21368] transcripts which are detectable by amplicon as
depicted in SEQ ID NO: 1574 in normal and cancerous lung tissues

SULF1 is a secreted protein which is found in the extracellular matrix. It is known to be 
downregulated in many epithelial cancer types. 

Expression of Sulfatase 1 (SULF1) transcripts detectable by SEQ ID NO: 1574 (e.g.,
variant nos. 13 and 14, represented in SEQ ID NOs: 1578, 1579) was measured by real time PCR. In
parallel the expression of four housekeeping genes - PBGD (GenBank Accession No.
BC019323; amplicon - SEQ ID NO: 1471), HPRT1 (GenBank Accession No. NM_000194;
amplicon - SEQ ID NO: 1468), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ
ID NO: 1474) and SDHA (GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477),
was measured similarly. For each RT sample, the expression of SEQ ID NO: 1574 was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel", above), to obtain a value of fold up-regulation for each sample relative to the median of
the normal PM samples.

Figure 59 is a histogram showing over-expression of the above-indicated Sulfatase 1
(SULF1) transcripts in cancerous lung samples relative to the normal samples. The number and
percentage of samples that exhibit at least 8 fold over-expression, out of the total number of
samples tested, are indicated at the bottom.

As is evident from Figure 59, the expression of Sulfatase 1 (SULF1) transcripts
detectable by SEQ ID NO: 1574 in cancer samples originating from non-small cell carcinoma was
significantly higher than in the non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table
2, "Tissue samples in testing panel"). Notably an over-expression of at least 8 fold was found in
11 out of 15 adenocarcinoma samples, 11 out of 16 squamous cell carcinoma samples, and in 4
out of 4 large cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of Sulfatase 1 (SULF1) transcripts
detectable by SEQ ID NO: 1574 in lung cancer samples versus the normal lung samples was
determined by T test as 3.18E-07.

Threshold of 8 fold overexpression was found to differentiate between cancer and normal
samples with P value of 1.18E-04 as checked by exact fisher test.

The above values demonstrate statistical significance of the results. 

According to the present invention, Sulfatase 1 (SULF1) is a non-limiting example of a
marker for diagnosing lung cancer. The Sulfatase 1 (SULF1) marker of the present invention
can be used alone or in combination, for various uses, including but not limited to, prognosis,
prediction, screening, early diagnosis, therapy selection and treatment monitoring of lung
cancer. Although optionally any method may be used to detect overexpression and/or
differential expression of this marker, preferably a NAT-based technology is used. Therefore,
optionally and preferably, any nucleic acid molecule capable of selectively hybridizing to
Sulfatase 1 (SULF1) as previously defined is also encompassed within the present invention.
Primer pairs are also optionally and preferably encompassed within the present invention; for
example, for the above experiment, the following primer pair was used as a non-limiting
illustrative example only of a suitable primer pair: Sulfatase 1 (SULF1)-forward primer (SEQ
ID NO: 1572): ACTCACTCAGAGACTAACACAAAGGAAG; and Sulfatase 1 (SULF1)-Reverse
primer (SEQ ID NO: 1573): AGTATGGGAAGAATTTACTGGTCACA.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: Sulfatase 1
(SULF1)-amplicon (SEQ ID NO: 1574):

ACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCT
ACAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACT.

According to other preferred embodiments of the present invention, Sulfatase 1 (SULF1)
or a fragment thereof comprises a biomarker for detecting lung cancer. Optionally and more
preferably, Sulfatase 1 (SULF1) splice variants, as depicted in SEQ ID NO: 1578, 1579 (e.g.,
variants no: 13 and 14), or a fragment thereof comprise a biomarker for detecting lung cancer.
Optionally and more preferably, the fragment of Sulfatase 1 (SULF1) comprises segment_TAA
seg 5 - SEQ ID NO: 1587. Also optionally and more preferably, any suitable method may be
used for detecting a fragment such as Sulfatase 1 (SULF1)_segment_TAA seg 5 - SEQ ID
NO: 1587 for example. Most preferably, NAT-based technology is used, such as any nucleic acid
molecule capable of specifically hybridizing with the fragment. Optionally and most preferably,
a primer pair is used for obtaining the fragment.

According to other preferred embodiments of the present invention, Sulfatase 1 (SULF1)
splice variants containing the unique segments as depicted in SEQ ID NOs: 1588-1591, for
example transcripts as depicted in SEQ ID NOs: 1575-1577, comprise a biomarker for
detecting lung cancer.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to Sulfatase 1 (SULF1) as described above, including but not limited to
SEQ ID NOs: 1586, 1580, 1582, 1584. Any oligopeptide or peptide relating to such an amino
acid sequence or fragment thereof may optionally also (additionally or alternatively) be used as
a biomarker, including but not limited to the unique amino acid sequence of the protein SEQ ID
NO: 1580, 1582, 1584, as depicted in SEQ ID NO: 1581, 1583, 1585, respectively.

The present invention also optionally encompasses antibodies capable of recognizing,
and/or being elicited by, such oligopeptides or peptides.

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
Sulfatase 1 (SULF1) as described above, optionally for any application.



Expression of SRY (sex determining region Y)-box 2 (SOX2)-[HUMHMGBOX] transcripts
which are detectable by the amplicon as depicted in SEQ ID NO: 1594 in normal and cancerous
lung tissues

Expression of SOX2 transcripts detectable by SEQ ID NO: 1594 (e.g., variant no. 0
represented by SEQ ID NO: 1595) was measured by real time PCR. In parallel the expression of four
housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon - SEQ ID
NO: 1471), HPRT1 (GenBank Accession No. NM_000194; amplicon - SEQ ID NO: 1468),
Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ ID NO: 1474) and SDHA
(GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477), was measured similarly.
For each RT sample, the expression of SEQ ID NO: 1594 was normalized to the geometric mean
of the quantities of the housekeeping genes. The normalized quantity of each RT sample was
then divided by the median of the quantities of the normal post-mortem (PM) samples (Sample
Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel", above), to obtain a value of
fold up-regulation for each sample relative to the median of the normal PM samples.

Figure 60 is a histogram showing over-expression of the above-indicated SOX2
transcripts in cancerous lung samples relative to the normal samples. The number and
percentage of samples that exhibit at least 5 fold over-expression, out of the total number of
samples tested, are indicated at the bottom.

As is evident from Figure 60, the expression of SOX2 transcripts detectable by SEQ ID
NO: 1594 in cancer samples originating from lung carcinoma was significantly higher than in the
non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel"). Notably an over-expression of at least 5 fold was found in 4 out of 15 adenocarcinoma
samples, 10 out of 16 squamous cell carcinoma samples, in 2 out of 4 large cell carcinoma
samples, and in 7 out of 8 small cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of SOX2 transcripts detectable by
SEQ ID NO: 1594 in lung cancer samples versus the normal lung samples was determined by T
test as 4.38E-05.

Threshold of 5 fold overexpression was found to differentiate between cancer and normal
samples with P value of 8.09E-04 as checked by exact fisher test.

The above values demonstrate statistical significance of the results.



According to the present invention, SOX2 is a non-limiting example of a marker for
diagnosing lung cancer. The SOX2 marker of the present invention can be used alone or in
combination, for various uses, including but not limited to, prognosis, prediction, screening,
early diagnosis, therapy selection and treatment monitoring of lung cancer. Although
optionally any method may be used to detect overexpression and/or differential expression of
this marker, preferably a NAT-based technology is used. Therefore, optionally and preferably,
any nucleic acid molecule capable of selectively hybridizing to SOX2 as previously defined is
also encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair: SOX2
-forward primer (SEQ ID NO: 1592): GGCGGCGGCAGGAT; and SOX2-Reverse primer
(SEQ ID NO: 1593): GTCGGGAGCGCAGGG.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: SOX2 -
amplicon (SEQ ID NO: 1594):

GGCGGCGGCAGGATCGGCCAGAGGAGGAGGGAAGCGCTTTTTTTGATCCTGATTCC
AGTTTGCCTCTCTCTTTTTTTCCCCCAAATTATTCTTCGCCTGATTTTCCTCGCGGAG
CCCTGCGCTCCCGAC.

According to other preferred embodiments of the present invention, SOX2 or a fragment
thereof comprises a biomarker for detecting lung cancer. Optionally and more preferably,
SOX2 splice variants, as depicted in SEQ ID NO: 1595 (e.g., variant no. 0), or a fragment
thereof comprise a biomarker for detecting lung cancer. Optionally and more preferably, the
fragment of SOX2 comprises segment_TAA seg 2 - SEQ ID NO: 1597. Also optionally and
more preferably, any suitable method may be used for detecting a fragment such as
SOX2_segment_TAA seg 2 - SEQ ID NO: 1597 for example. Most preferably, NAT-based
technology is used, such as any nucleic acid molecule capable of specifically hybridizing with the
fragment. Optionally and most preferably, a primer pair is used for obtaining the fragment.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to SOX2 as described above, including but not limited to
SEQ ID NO: 1596. Any oligopeptide or peptide relating to such an amino acid sequence or
fragment thereof may optionally also (additionally or alternatively) be used as a biomarker.

The present invention also optionally encompasses antibodies capable of recognizing, 
and/or being elicited by, such oligopeptides or peptides. 

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
SOX2 as described above, optionally for any application.



Expression of Plakophilin 1 (ectodermal dysplasia/skin fragility syndrome) (PKP1)-[HSB6PR]
transcripts which are detectable by the amplicon as depicted in SEQ ID NO: 1600 in normal and
cancerous lung tissues

Expression of PKP1 transcripts detectable by SEQ ID NO: 1600 (e.g., variant nos. 0, 5 and
6, represented by SEQ ID NOs: 1601-1603) was measured by real time PCR. In parallel the
expression of four housekeeping genes - PBGD (GenBank Accession No. BC019323; amplicon
- SEQ ID NO: 1471), HPRT1 (GenBank Accession No. NM_000194; amplicon - SEQ ID
NO: 1468), Ubiquitin (GenBank Accession No. BC000449; amplicon - SEQ ID NO: 1474) and
SDHA (GenBank Accession No. NM_004168; amplicon - SEQ ID NO: 1477), was measured
similarly. For each RT sample, the expression of SEQ ID NO: 1600 was normalized to the
geometric mean of the quantities of the housekeeping genes. The normalized quantity of each
RT sample was then divided by the median of the quantities of the normal post-mortem (PM)
samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing panel", above), to
obtain a value of fold up-regulation for each sample relative to the median of the normal PM
samples.

Figure 61 is a histogram showing over-expression of the above-indicated PKP1
transcripts in cancerous lung samples relative to the normal samples. The number and
percentage of samples that exhibit at least 7 fold over-expression, out of the total number of
samples tested, are indicated at the bottom.

As is evident from Figure 61, the expression of PKP1 transcripts detectable by SEQ ID
NO: 1600 in cancer samples originating from lung carcinoma was significantly higher than in the
non-cancerous samples (Sample Nos. 46-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel"). Notably an over-expression of at least 7 fold was found in 11 out of 16 squamous cell
carcinoma samples, and in 1 out of 4 large cell carcinoma samples.

Statistical analysis was applied to verify the significance of these results, as described
below.

The P value for the difference in the expression levels of PKP1 transcripts detectable by
SEQ ID NO: 1600 in lung cancer samples versus the normal lung samples was determined by T
test as 3.18E-03.

Threshold of 7 fold overexpression was found to differentiate between cancer and
normal samples with P value of 3.50E-02 as checked by exact fisher test.

The above values demonstrate statistical significance of the results.

According to the present invention, PKP1 is a non-limiting example of a marker for
diagnosing lung cancer. The PKP1 marker of the present invention can be used alone or in
combination, for various uses, including but not limited to, prognosis, prediction, screening,
early diagnosis, therapy selection and treatment monitoring of lung cancer. Although
optionally any method may be used to detect overexpression and/or differential expression of
this marker, preferably a NAT-based technology is used. Therefore, optionally and preferably,
any nucleic acid molecule capable of selectively hybridizing to PKP1 as previously defined is
also encompassed within the present invention. Primer pairs are also optionally and preferably
encompassed within the present invention; for example, for the above experiment, the following
primer pair was used as a non-limiting illustrative example only of a suitable primer pair: PKP1
-forward primer (SEQ ID NO: 1598): CCCCAGACTCTGTGCACTTCA; and PKP1-Reverse
primer (SEQ ID NO: 1599): TGGGCTCTGCTCTGTCTTAGTGTA.

The present invention also preferably encompasses any amplicon obtained through the
use of any suitable primer pair; for example, for the above experiment, the following amplicon
was obtained as a non-limiting illustrative example only of a suitable amplicon: PKP1 -
amplicon (SEQ ID NO: 1600):

CCCCAGACTCTGTGCACTTCAGACCAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTA
TGAGAAAACCTGTGTGGACATCCCTTGGTGTACACTAAGACAGAGCAGAGCCCA.

According to other preferred embodiments of the present invention, PKP1 or a fragment
thereof comprises a biomarker for detecting lung cancer. Optionally and more preferably, PKP1
splice variants, as depicted in SEQ ID NO: 1601-1603 (e.g., variants no: 0, 5 and 6), or a
fragment thereof comprise a biomarker for detecting lung cancer. Optionally and more
preferably, the fragment of PKP1 comprises segment_TAA seg 34 - SEQ ID NO: 1608. Also
optionally and more preferably, any suitable method may be used for detecting a fragment such
as PKP1_segment_TAA seg 34 - SEQ ID NO: 1608 for example. Most preferably, NAT-based
technology is used, such as any nucleic acid molecule capable of specifically hybridizing with the
fragment. Optionally and most preferably, a primer pair is used for obtaining the fragment.

According to other preferred embodiments of the present invention, PKP1 splice variants
containing the unique segment_8 as depicted in SEQ ID NO: 1607, for example variant 6, as
depicted in SEQ ID NO: 1603, are suitable as biomarkers for detecting lung cancer.

According to still other preferred embodiments, the present invention optionally and
preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid
sequence corresponding to PKP1 as described above, including but not limited to SEQ ID NOs:
1604-1606. Any oligopeptide or peptide relating to such an amino acid sequence or fragment
thereof may optionally also (additionally or alternatively) be used as a biomarker.

The present invention also optionally encompasses antibodies capable of recognizing, 
and/or being elicited by, such oligopeptides or peptides. 

The present invention also optionally and preferably encompasses any nucleic acid
sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to
PKP1 as described above, optionally for any application.



Combined expression of 12 sequences (SEQ ID NOs: 1480, 1517, 1529, 1532, 1558,
1574, 1594, 1600, 1616, 1619, 1622, 1625) in normal and cancerous lung tissues.

Expression of several transcripts detectable by SEQ ID NOs: 1480, 1517, 1529, 1532,
1558, 1574, 1594, 1600, 1616, 1619, 1622, 1625 was measured by real time PCR (the expression
of each SEQ ID was checked separately). In parallel, the expression of four housekeeping genes
- PBGD (GenBank Accession No. BC019323; amplicon - SEQ ID NO: 1471), HPRT1 (GenBank
Accession No. NM_000194; amplicon - SEQ ID NO: 1468), Ubiquitin (GenBank Accession No.
BC000449; amplicon - SEQ ID NO: 1474) and SDHA (GenBank Accession No. NM_004168;
amplicon - SEQ ID NO: 1477) - was measured similarly. For each RT sample, the expression of
SEQ ID NOs: 1480, 1517, 1529, 1532, 1558, 1574, 1594, 1600, 1616, 1619, 1622, 1625 was
normalized to the geometric mean of the quantities of the housekeeping genes. The normalized
quantity of each RT sample was then divided by the median of the quantities of the normal
post-mortem (PM) samples (Sample Nos. 47-50, 90-93, 96-99, Table 2, "Tissue samples in testing
panel", above), to obtain a value of fold up-regulation for each sample relative to the median of
the normal PM samples.
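
The normalization described above can be sketched as follows. This is a minimal illustration only, assuming that each sample already has a measured quantity for the target amplicon and for each of the four housekeeping genes; the function names, the dictionary layout and the default threshold argument are hypothetical and are not taken from the experiment itself.

```python
from statistics import geometric_mean, median


def normalized_quantity(target_qty, housekeeping_qtys):
    """Divide the target quantity by the geometric mean of the
    housekeeping-gene quantities measured in the same RT sample."""
    return target_qty / geometric_mean(housekeeping_qtys)


def fold_changes(normalized_by_sample, normal_sample_ids):
    """Divide every normalized quantity by the median of the normal
    (post-mortem) samples, giving fold up-regulation per sample."""
    reference = median(normalized_by_sample[s] for s in normal_sample_ids)
    return {s: q / reference for s, q in normalized_by_sample.items()}


def any_marker_over_threshold(fold_by_marker, threshold=10.0):
    """True if at least one marker reaches the fold threshold
    (the combined analysis in the text uses a 10 fold threshold)."""
    return any(fold >= threshold for fold in fold_by_marker.values())


if __name__ == "__main__":
    # Made-up quantities, for illustration only.
    norm = {
        "normal_1": normalized_quantity(1.0, [1.0, 1.0, 1.0, 1.0]),
        "normal_2": normalized_quantity(3.0, [1.0, 1.0, 1.0, 1.0]),
        "tumor_1": normalized_quantity(40.0, [2.0, 2.0, 2.0, 2.0]),
    }
    print(fold_changes(norm, ["normal_1", "normal_2"]))
```

The geometric mean is used for the housekeeping genes, as stated in the text, so that no single reference gene dominates the normalization factor.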

Figure 62 is a histogram showing over-expression of the above-indicated transcripts in
cancerous lung samples relative to the normal samples. The number and percentage of samples
that exhibit at least 10 fold over-expression of at least one of the SEQ IDs, out of the total
number of samples tested, is indicated at the bottom.

As is evident from Figure 62, an over-expression of at least 10 fold in at least one of the
SEQ IDs was found in 15 out of 15 adenocarcinoma samples, 15 out of 16 squamous cell
carcinoma samples, 4 out of 4 large cell carcinoma samples, and in 8 out of 8 small-cell
samples.

Statistical analysis was applied to verify the significance of these results, as described
below. A threshold of 10 fold overexpression of at least one of the amplicons as depicted in SEQ
ID NOs: 1480, 1517, 1529, 1532, 1558, 1574, 1594, 1600, 1616, 1619, 1622, 1625 was found
to differentiate between cancer and normal samples with a P value of 2.37E-08, as checked by
Fisher's exact test.
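
For illustration, a one-sided Fisher's exact test on a 2x2 table (cancer or normal versus above or below the threshold) can be sketched as below. The text does not state whether a one-sided or two-sided test was used, nor how many normal samples exceeded the threshold, so the sidedness and the normal-sample counts in the usage example are assumptions, and the sketch is not expected to reproduce the reported P value. The cancer counts (42 of 43 samples above the threshold) follow from the per-subtype figures given above.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cancer, normal. Columns: above threshold, below threshold.
    Returns the hypergeometric probability of observing `a` or more
    cancer samples above the threshold with all margins held fixed.
    """
    row1 = a + b          # number of cancer samples
    col1 = a + c          # number of samples above threshold
    n = a + b + c + d     # total number of samples
    total = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 42 of 43 cancer samples above threshold (from the text); the normal
    # counts below (1 above, 11 below) are hypothetical placeholders.
    print(fisher_exact_one_sided(42, 1, 1, 11))
```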

The above values demonstrate statistical significance of the results.

Kits and Diagnostic Assays and Methods
The markers described with regard to any of the Examples above can be used alone, in
combination with other markers described above, and/or with other entirely different markers,
including but not limited to UbcH10 (see US Patent Application Nos: 60/535,904 and
60/572,122; attorney refs: 27080 and 28045, filed on January 13 and May 19, 2004,
respectively), Troponin (see US Patent Application No: 60/539,129; attorney ref: 26940), Sim2
(see PCT Application No. WO 2004/012847), PE-10 (SP-A), TTF-1, Cytokeratin 5/6, to aid in
the diagnosis of lung cancer. All of these applications are hereby incorporated by reference as if
fully set forth herein. These markers can be used in combination with other markers for a
number of uses, including but not limited to, prognosis, prediction, screening, early diagnosis,
therapy selection and treatment monitoring of lung cancer, and also optionally including staging
of the disease. Used together, they may provide more information for the diagnostician,
increasing the percentage of true positive and true negative diagnoses and decreasing the
percentage of false positive or false negative diagnoses, as compared to the results obtained with
a single marker alone.

Assays and methods according to the present invention, as described above, include but
are not limited to, immunoassays, hybridization assays and NAT-based assays. The combination
of the markers of the present invention with other markers described above, and/or with other
entirely different markers to aid in the diagnosis of lung cancer could be carried out as a mix of
NAT-based assays, immunoassays and hybridization assays. According to preferred
embodiments of the present invention, the assays are NAT-based assays, as described for
example with regard to the Examples above.

In yet another aspect, the present invention provides kits for aiding a diagnosis of lung
cancer, wherein the kits can be used to detect the markers of the present invention. For example,
the kits can be used to detect any one or combination of markers described above, which
markers are differentially present in samples of lung cancer patients and normal patients. The
kits of the invention have many applications. For example, the kits can be used to determine
whether a subject has a small cell lung cancer, non-small cell lung cancer, adenocarcinoma,
bronchoalveolar-alveolar, squamous cell or large cell carcinoma, or has a negative diagnosis,
thus aiding a lung cancer diagnosis. In another example, the kits can be used to identify
compounds that modulate expression of the markers in in vitro lung cells or in vivo animal
models for lung cancer.

In one embodiment, a kit comprises: (a) a substrate comprising an adsorbent thereon,
wherein the adsorbent is suitable for binding a marker, and (b) a washing solution or instructions
for making a washing solution, wherein the combination of the adsorbent and the washing
solution allows detection of the marker as previously described.

Optionally, the kit can further comprise instructions for suitable operational parameters
in the form of a label or a separate insert. For example, the kit may have standard instructions
informing a consumer/kit user how to wash the probe after a sample of seminal plasma or other
tissue sample is contacted on the probe.

In another embodiment, a kit comprises (a) an antibody that specifically binds to a 
marker; and (b) a detection reagent. Such kits can be prepared from the materials described 
above. 

In either embodiment, the kit may optionally further comprise standard or control
information, and/or a control amount of material, so that the test sample can be compared with
the control information standard and/or control amount to determine if the test amount of a
marker detected in a sample is a diagnostic amount consistent with a diagnosis of lung cancer.

Therapeutic applications of splice variants of the present invention

Splice variants described herein (including any polynucleotide, oligonucleotide,
polypeptide, peptide or fragments thereof) or antibodies that specifically bind thereto may
optionally be used for therapeutic applications, for example to treat the diseases described herein
with regard to diagnostic applications thereof. A "variant-treatable" disease refers to any disease
that is treatable by using a splice variant of any of the therapeutic proteins according to the present
invention. "Treatment" also encompasses prevention, amelioration, elimination and control of the
disease and/or pathological condition. The diseases for which such variants may be useful
therapeutic agents are described in greater detail below for each of the variants. The variants
themselves are described by "cluster" or by gene, as these variants are splice variants of known
proteins. Therefore, a "cluster-related disease" or a "variant-related disease" refers to a disease that
may be treated by a particular protein which, with regard to the description of such diseases below,
is a therapeutic protein variant according to the present invention.

The term "biologically active", as used herein, refers to a protein having structural,
regulatory, or biochemical functions of a naturally occurring molecule. Likewise,
"immunologically active" refers to the capability of the natural, recombinant, or synthetic ligand, or
any oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and
to bind with specific antibodies.

The term "modulate", as used herein, refers to a change in the activity of at least one
receptor mediated activity. For example, modulation may cause an increase or a decrease in protein
activity, binding characteristics, or any other biological, functional or immunological properties of
a ligand.



METHODS OF TREATMENT 
As mentioned hereinabove, the novel therapeutic protein variants of the present invention
and compositions derived therefrom (i.e., peptides, oligonucleotides) can be used to treat
cluster-related diseases.

Thus, according to an additional aspect of the present invention there is provided a method
of treating cluster-related disease in a subject.

The subject according to the present invention is a mammal, preferably a human, which has
at least one type of the cluster-related diseases described hereinabove.

As mentioned hereinabove, the biomolecular sequences of the present invention can be
used to treat subjects with the above-described diseases.

The subject according to the present invention is a mammal, preferably a human, which is
diagnosed with one of the diseases described hereinabove, or alternatively is predisposed to having
one of the diseases described hereinabove.

As used herein the term "treating" refers to preventing, curing, reversing, attenuating,
alleviating, minimizing, suppressing or halting the deleterious effects of the above-described
diseases.

Treating, according to the present invention, can be effected by specifically upregulating or
alternatively downregulating the expression of at least one of the polypeptides of the present
invention in the subject.

Optionally, upregulation may be effected by administering to the subject at least one of the
polypeptides of the present invention (e.g., recombinant or synthetic) or an active portion thereof,
as described herein. However, since the bioavailability of large polypeptides may potentially be
relatively small due to high degradation rate and low penetration rate, administration of
polypeptides is preferably confined to small peptide fragments (e.g., about 100 amino acids). The
polypeptide or peptide may optionally be administered in a pharmaceutical composition, described
in more detail below.

It will be appreciated that treatment of the above-described diseases according to the
present invention may be combined with other treatment methods known in the art (i.e.,
combination therapy). Thus, treatment of malignancies using the agents of the present invention
may be combined with, for example, radiation therapy, antibody therapy and/or chemotherapy.

Alternatively or additionally, an upregulating method may optionally be effected by
specifically upregulating the amount (optionally expression) in the subject of at least one of the
polypeptides of the present invention or active portions thereof.

As is mentioned hereinabove and in the Examples section which follows, the biomolecular
sequences of this aspect of the present invention may be used as valuable therapeutic tools in the
treatment of diseases in which altered activity or expression of the wild-type gene product is known
to contribute to disease onset or progression. For example, in case a disease is caused by
overexpression of a membrane bound receptor, a soluble variant thereof may be used as an
antagonist which competes with the receptor for binding the ligand, to thereby terminate signaling
from the receptor.

Examples of such diseases are listed in the Examples section which follows.

It will be appreciated that the polypeptides of the present invention may also have agonistic
properties. These include increasing the stability of the ligand (e.g., IL-4), protection from
proteolysis and modification of the pharmacokinetic properties of the ligand (i.e., increasing the
half-life of the ligand, while decreasing the clearance thereof). As such, the biomolecular
sequences of this aspect of the present invention may be used to treat conditions or diseases in
which the wild-type gene product plays a favorable role, for example, increasing angiogenesis in
cases of diabetes or ischemia.

Upregulating expression of the therapeutic protein variants of the present invention may be
effected via the administration of at least one of the exogenous polynucleotide sequences of the
present invention, ligated into a nucleic acid expression construct designed for expression of coding
sequences in eukaryotic cells (e.g., mammalian cells), as described above. Accordingly, the
exogenous polynucleotide sequence may be a DNA or RNA sequence encoding the variants of the
present invention or active portions thereof.

It will be appreciated that the nucleic acid construct can be administered to the individual
employing any suitable mode of administration, described hereinbelow (i.e., in-vivo gene therapy).
Alternatively, the nucleic acid construct is introduced into a suitable cell via an appropriate gene
delivery vehicle/method (transfection, transduction, homologous recombination, etc.) and an
expression system as needed, and then the modified cells are expanded in culture and returned to the
individual (i.e., ex-vivo gene therapy). Nucleic acid constructs are described in greater detail
above.

It will be appreciated that the present methodology may also be effected by specifically
upregulating the expression of the variants of the present invention endogenously in the subject.
Agents for upregulating endogenous expression of specific splice variants of a given gene include
antisense oligonucleotides, which are directed at splice sites of interest, thereby altering the splicing
pattern of the gene. This approach has been successfully used for shifting the balance of expression
of the two isoforms of Bcl-x [Taylor (1999) Nat. Biotechnol. 17:1097-1100; and Mercatante (2001)
J. Biol. Chem. 276:16411-16417]; IL-5R [Karras (2000) Mol. Pharmacol. 58:380-387]; and c-myc
[Giles (1999) Antisense Nucleic Acid Drug Dev. 9:213-220].

For example, interleukin 5 and its receptor play a critical role as regulators of hematopoiesis
and as mediators in some inflammatory diseases such as allergy and asthma. Two alternatively
spliced isoforms are generated from the IL-5R gene, which include (i.e., long form) or exclude (i.e.,
short form) exon 9. The long form encodes the intact membrane-bound receptor, while the
shorter form encodes a secreted soluble non-functional receptor. Using 2'-O-MOE-
oligonucleotides specific to regions of exon 9, Karras and co-workers (supra) were able to
significantly decrease the expression of the wild type receptor and increase the expression of the
shorter isoforms. Design and synthesis of oligonucleotides which can be used according to the
present invention are described hereinbelow and by Sazani and Kole (2003) Progress in Molecular
and Subcellular Biology 31:217-239.

Upregulating expression of the polypeptides of the present invention in a subject may be
effected via the administration of at least one of the exogenous polynucleotide sequences of the
present invention (e.g., SEQ ID NOs: 3, 7, 11, 15, 19, 23, 27, 31, 35, 39 or 43) ligated into a
nucleic acid expression construct designed for expression of coding sequences in eukaryotic cells
(e.g., mammalian cells). Accordingly, the exogenous polynucleotide sequence may be a DNA or
RNA sequence encoding the variants of the present invention or active portions thereof.

It will be appreciated that the nucleic acid construct can be administered to the individual
employing any suitable mode of administration, described hereinbelow (i.e., in-vivo gene therapy).
Alternatively, the nucleic acid construct is introduced into a suitable cell via an appropriate gene
delivery vehicle/method (transfection, transduction, homologous recombination, etc.) and an
expression system as needed, and then the modified cells are expanded in culture and returned to the
individual (i.e., ex-vivo gene therapy).

Preferably, the promoter utilized by the nucleic acid construct of the present invention is
active in the specific cell population transformed. Examples of cell type-specific and/or
tissue-specific promoters include promoters such as albumin that is liver specific [Pinkert et al.,
(1987) Genes Dev. 1:268-277], lymphoid specific promoters [Calame et al., (1988) Adv. Immunol.
43:235-275]; in particular promoters of T-cell receptors [Winoto et al., (1989) EMBO J. 8:729-733]
and immunoglobulins [Banerji et al. (1983) Cell 33:729-740], neuron-specific promoters such as
the neurofilament promoter [Byrne et al. (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477],
pancreas-specific promoters [Edlund et al. (1985) Science 230:912-916] or mammary
gland-specific promoters such as the milk whey promoter (U.S. Pat. No. 4,873,316 and European
Patent Application No. EP 264,166).

Examples of suitable constructs include, but are not limited to, pcDNA3, pcDNA3.1 (+/-),
pGL3, pZeoSV2 (+/-), pDisplay, pEF/myc/cyto, pCMV/myc/cyto, each of which is commercially
available from Invitrogen Co. (www.invitrogen.com). Examples of retroviral vector and packaging
systems are those sold by Clontech, San Diego, Calif., including Retro-X vectors pLNCX and
pLXSN, which permit cloning into multiple cloning sites and in which the transgene is transcribed
from the CMV promoter. Vectors derived from Mo-MuLV are also included, such as pBabe, where
the transgene will be transcribed from the 5' LTR promoter.

Currently preferred in vivo nucleic acid transfer techniques include transfection with viral
or non-viral constructs, such as adenovirus, lentivirus, Herpes simplex I virus, or adeno-associated
virus (AAV) and lipid-based systems. Useful lipids for lipid-mediated transfer of the gene are, for
example, DOTMA, DOPE, and DC-Chol [Tonkinson et al., Cancer Investigation, 14(1): 54-65
(1996)]. The most preferred constructs for use in gene therapy are viruses, most preferably
adenoviruses, AAV, lentiviruses, or retroviruses. A viral construct such as a retroviral construct
includes at least one transcriptional promoter/enhancer or locus-defining element(s), or other
elements that control gene expression by other means such as alternate splicing, nuclear RNA
export, or post-translational modification of messenger. Such vector constructs also include a
packaging signal, long terminal repeats (LTRs) or portions thereof, and positive and negative strand
primer binding sites appropriate to the virus used, unless it is already present in the viral construct.
In addition, such a construct typically includes a signal sequence for secretion of the peptide from a
host cell in which it is placed. Preferably the signal sequence for this purpose is a mammalian
signal sequence or the signal sequence of the polypeptide variants of the present invention.
Optionally, the construct may also include a signal that directs polyadenylation, as well as one or
more restriction sites and a translation termination sequence. By way of example, such constructs
will typically include a 5' LTR, a tRNA binding site, a packaging signal, an origin of second-strand
DNA synthesis, and a 3' LTR or a portion thereof. Other vectors can be used that are non-viral,
such as cationic lipids, polylysine, and dendrimers.

It will be appreciated that the present methodology may also be performed by
specifically upregulating the expression of the splice variants of the present invention
endogenously in the subject. Agents for upregulating endogenous expression of specific splice
variants of a given gene include antisense oligonucleotides, which are directed at splice sites of
interest, thereby altering the splicing pattern of the gene. This approach has been successfully used
for shifting the balance of expression of the two isoforms of Bcl-x [Taylor (1999) Nat. Biotechnol.
17:1097-1100; and Mercatante (2001) J. Biol. Chem. 276:16411-16417]; IL-5R [Karras (2000)
Mol. Pharmacol. 58:380-387]; and c-myc [Giles (1999) Antisense Nucleic Acid Drug Dev. 9:213-220].

For example, interleukin 5 and its receptor play a critical role as regulators of hematopoiesis
and as mediators in some inflammatory diseases such as allergy and asthma. Two alternatively
spliced isoforms are generated from the IL-5R gene, which include (i.e., long form) or exclude (i.e.,
short form) exon 9. The long form encodes the intact membrane-bound receptor, while the
shorter form encodes a secreted soluble non-functional receptor. Using 2'-O-MOE-
oligonucleotides specific to regions of exon 9, Karras and co-workers (supra) were able to
significantly decrease the expression of the wild type receptor and increase the expression of the
shorter isoforms. Design and synthesis of oligonucleotides which can be used according to the
present invention are described hereinbelow and by Sazani and Kole (2003) Progress in Molecular
and Subcellular Biology 31:217-239.

Treatment can preferably be effected by agents which are capable of specifically
downregulating expression (or activity) of at least one of the polypeptide variants of the present
invention.

Downregulating the expression of the therapeutic protein variants of the present invention
may be achieved using oligonucleotide agents such as those described in greater detail below.

SiRNA molecules - Small interfering RNA (siRNA) molecules can be used to down-regulate
expression of the therapeutic protein variants of the present invention. RNA interference
is a two-step process. In the first step, which is termed the initiation step, input dsRNA is digested
into 21-23 nucleotide (nt) small interfering RNAs (siRNA), probably by the action of Dicer, a
member of the RNase III family of dsRNA-specific ribonucleases, which processes (cleaves)
dsRNA (introduced directly or via a transgene or a virus) in an ATP-dependent manner.
Successive cleavage events degrade the RNA to 19-21 bp duplexes (siRNA), each with
2-nucleotide 3' overhangs [Hutvagner and Zamore Curr. Opin. Genetics and Development
12:225-232 (2002); and Bernstein Nature 409:363-366 (2001)].

In the effector step, the siRNA duplexes bind to a nuclease complex to form the
RNA-induced silencing complex (RISC). An ATP-dependent unwinding of the siRNA duplex is
required for activation of the RISC. The active RISC then targets the homologous transcript by
base pairing interactions and cleaves the mRNA into 12 nucleotide fragments from the 3' terminus
of the siRNA [Hutvagner and Zamore Curr. Opin. Genetics and Development 12:225-232 (2002);
Hammond et al. Nat. Rev. Gen. 2:110-119 (2001); and Sharp Genes. Dev. 15:485-90
(2001)]. Although the mechanism of cleavage is still to be elucidated, research indicates that each
RISC contains a single siRNA and an RNase [Hutvagner and Zamore Curr. Opin. Genetics and
Development 12:225-232 (2002)].

Because of the remarkable potency of RNAi, an amplification step within the RNAi
pathway has been suggested. Amplification could occur by copying of the input dsRNAs, which
would generate more siRNAs, or by replication of the siRNAs formed. Alternatively or
additionally, amplification could be effected by multiple turnover events of the RISC [Hammond et
al. Nat. Rev. Gen. 2:110-119 (2001); Sharp Genes. Dev. 15:485-90 (2001); Hutvagner and Zamore
Curr. Opin. Genetics and Development 12:225-232 (2002)]. For more information on RNAi see
the following reviews: Tuschl ChemBiochem. 2:239-245 (2001); Cullen Nat. Immunol. 3:597-599
(2002); and Brantl Biochim. Biophys. Acta 1575:15-25 (2002).

Synthesis of RNAi molecules suitable for use with the present invention can be
effected as follows. First, the mRNA sequence is scanned downstream of the AUG start codon for
AA dinucleotide sequences. Occurrence of each AA and the 3' adjacent 19 nucleotides is recorded
as potential siRNA target sites. Preferably, siRNA target sites are selected from the open reading
frame, as untranslated regions (UTRs) are richer in regulatory protein binding sites. UTR-binding
proteins and/or translation initiation complexes may interfere with binding of the siRNA
endonuclease complex [Tuschl ChemBiochem. 2:239-245]. It will be appreciated, though, that
siRNAs directed at untranslated regions may also be effective, as demonstrated for GAPDH,
wherein siRNA directed at the 5' UTR mediated about 90 % decrease in cellular GAPDH mRNA
and completely abolished protein level (www.ambion.com/techlib/tn/91/912.html).
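
The first step described above (scanning downstream of the start codon for AA dinucleotides and recording each AA together with its 3' adjacent 19 nucleotides) can be sketched as follows. This is a minimal illustration under stated assumptions: the first AUG found in the sequence is taken as the start codon, the restriction to the open reading frame is not enforced, and the function name and return format are hypothetical.

```python
def find_sirna_target_sites(mrna, flank_len=19):
    """Return (position, site) pairs for every AA dinucleotide found
    downstream of the first AUG, where each site is the AA plus the
    3' adjacent `flank_len` nucleotides (21 nt in total by default)."""
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    start = seq.find("AUG")
    if start < 0:
        return []  # no start codon found, nothing to scan
    site_len = 2 + flank_len
    sites = []
    # Begin scanning immediately after the start codon.
    for i in range(start + 3, len(seq) - site_len + 1):
        if seq[i:i + 2] == "AA":
            sites.append((i, seq[i:i + site_len]))
    return sites


if __name__ == "__main__":
    # Synthetic sequence for illustration only.
    example = "GGAUG" + "CCC" + "AA" + "GCUAGCUAGCUAGCUAGCU" + "CC"
    for position, site in find_sirna_target_sites(example):
        print(position, site)
```

Candidate sites found this way would still need the database comparison and selection steps described in the text.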

Second, potential target sites are compared to an appropriate genomic database
(e.g., human, mouse, rat etc.) using any sequence alignment software, such as the BLAST software
available from the NCBI server (www.ncbi.nlm.nih.gov/BLAST/). Putative target sites which
exhibit significant homology to other coding sequences are filtered out.

Qualifying target sequences are selected as template for siRNA synthesis. Preferred
sequences are those including low G/C content, as these have proven to be more effective in
mediating gene silencing as compared to those with G/C content higher than 55 %. Several target
sites are preferably selected along the length of the target gene for evaluation. Target sites are
selected from the unique nucleotide sequences of each of the polynucleotides of the present
invention, such that each polynucleotide is specifically downregulated. For better evaluation of the
selected siRNAs, a negative control is preferably used in conjunction. Negative control siRNAs
preferably include the same nucleotide composition as the siRNAs but lack significant homology to
the genome. Thus, a scrambled nucleotide sequence of the siRNA is preferably used, provided it
does not display any significant homology to any other gene.
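
The G/C content criterion and the scrambled negative control described above can be illustrated with the short sketch below. It is a simplified example under stated assumptions: the 55 % cutoff is taken from the text, whereas the function names and the fixed random seed are hypothetical, and the homology check required for both the target sites and the scrambled control (e.g., by BLAST) is not performed here.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C nucleotides in a sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def filter_by_gc(sites, max_gc=0.55):
    """Keep candidate sites whose G/C content does not exceed `max_gc`."""
    return [site for site in sites if gc_fraction(site) <= max_gc]


def scrambled_control(sirna, seed=0):
    """Shuffle the nucleotides of an siRNA so that the control has the
    same composition. The result must still be checked separately for
    lack of significant homology to any other gene."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    bases = list(sirna)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    candidates = ["AAGCUAGCUAGCUAGCUAGCU", "AAGGGCCCGGGCCCGGGCCCG"]
    for site in filter_by_gc(candidates):
        print(site, round(gc_fraction(site), 2), scrambled_control(site))
```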
DNAzyme molecules - Another agent capable of downregulating expression of the
polypeptides of the present invention is a DNAzyme molecule capable of specifically cleaving an
mRNA transcript or DNA sequence of the polynucleotides of the present invention. DNAzymes are
single-stranded polynucleotides which are capable of cleaving both single and double stranded
target sequences (Breaker, R.R. and Joyce, G. Chemistry and Biology 1995;2:655; Santoro, S.W. &
Joyce, G.F. Proc. Natl. Acad. Sci. USA 1997;94:4262). A general model (the "10-23" model) for
the DNAzyme has been proposed. "10-23" DNAzymes have a catalytic domain of 15
deoxyribonucleotides, flanked by two substrate-recognition domains of seven to nine
deoxyribonucleotides each. This type of DNAzyme can effectively cleave its substrate RNA at
purine:pyrimidine junctions (Santoro, S.W. & Joyce, G.F. Proc. Natl. Acad. Sci. USA 199; for a review
of DNAzymes see Khachigian, LM, Curr Opin Mol Ther 4:119-21 (2002)).
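
As an illustration of the purine:pyrimidine cleavage rule stated above, the sketch below lists candidate junctions in a target RNA together with the flanking target segments that substrate-recognition arms of seven to nine nucleotides would need to bind. It is a simplified example only: the treatment of the purine as lying between the two flanks is an assumption of this sketch, the function name and return format are hypothetical, and no claim is made about cleavage efficiency at any listed site.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CU")


def dnazyme_candidate_sites(rna, arm_len=8):
    """List purine:pyrimidine junctions with enough flanking sequence.

    Returns tuples (p, left_flank, right_flank), where seq[p] is the
    purine, seq[p + 1] is the pyrimidine, left_flank is the `arm_len`
    nucleotides upstream of the purine, and right_flank is the
    `arm_len` nucleotides starting at the pyrimidine.
    """
    if not 7 <= arm_len <= 9:
        raise ValueError("arm_len is expected to be 7 to 9 nucleotides")
    seq = rna.upper().replace("T", "U")  # accept DNA or RNA letters
    sites = []
    for p in range(arm_len, len(seq) - arm_len):
        if seq[p] in PURINES and seq[p + 1] in PYRIMIDINES:
            left = seq[p - arm_len:p]
            right = seq[p + 1:p + 1 + arm_len]
            sites.append((p, left, right))
    return sites


if __name__ == "__main__":
    # Synthetic target for illustration only.
    target = "CCUUCCUUAUCCUUCCU"
    for site in dnazyme_candidate_sites(target, arm_len=8):
        print(site)
```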

Target sites for DNAzymes are selected from the unique nucleotide sequences of each of 
the polynucleotides of the present invention, such that each polynucleotide is specifically down 
regulated. 

Examples of construction and amplification of synthetic, engineered DNAzymes
recognizing single and double-stranded target cleavage sites have been disclosed in U.S. Pat. No.
6,326,174 to Joyce et al. DNAzymes of similar design directed against the human Urokinase
receptor were recently observed to inhibit Urokinase receptor expression, and successfully inhibit
colon cancer cell metastasis in vivo (Itoh et al., 20002, Abstract 409, Ann Meeting Am Soc Gen
Ther www.asgt.org). In another application, DNAzymes complementary to bcr-abl oncogenes
were successful in inhibiting the oncogene's expression in leukemia cells, and lessening relapse
rates in autologous bone marrow transplant in cases of CML and ALL.

Antisense molecules - Downregulation of the polynucleotides of the present 
invention can also be effected by using an antisense polynucleotide capable of specifically 
hybridizing with an mRNA transcript encoding the polypeptide variants of the present invention. 

The term "antisense", as used herein, refers to any composition containing nucleotide 
sequences, which are complementary to a specific DNA or RNA sequence. 

The term "antisense strand" is used in reference to a nucleic acid strand that is
complementary to the "sense" strand. Antisense molecules also include peptide nucleic acids and
may be produced by any method including synthesis or transcription. Once introduced into a cell,
the complementary nucleotides combine with natural sequences produced by the cell to form
duplexes and block either transcription or translation. The designation "negative" is sometimes
used in reference to the antisense strand, and "positive" is sometimes used in reference to the sense
strand. Antisense oligonucleotides are also used for modulation of alternative splicing in vivo and
for diagnostics in vivo and in vitro (Khelifi C. et al., 2002, Current Pharmaceutical Design
8:451-1466; Sazani, P., and Kole, R. Progress in Molecular and Subcellular Biology, 2003, 31:217-239).
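
The complementarity relationship between sense and antisense strands described above can be illustrated with a short sketch that derives a DNA antisense oligonucleotide as the reverse complement of a chosen mRNA region. The function name and the example region are hypothetical, and the sketch deliberately ignores the delivery and binding-affinity considerations that govern actual antisense design.

```python
# Map each RNA base of the sense strand to the complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna_region):
    """Return the DNA oligonucleotide (written 5' to 3') that is
    complementary to the given mRNA region (also written 5' to 3')."""
    seq = mrna_region.upper().replace("T", "U")  # accept DNA or RNA letters
    return seq.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical target region, for illustration only.
    print(antisense_oligo("AUGGCCAAGUUC"))
```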

Design of antisense molecules which can be used to efficiently downregulate
expression of the polypeptides of the present invention must be effected while considering two
aspects important to the antisense approach. The first aspect is delivery of the oligonucleotide into
the cytoplasm of the appropriate cells, while the second aspect is design of an oligonucleotide
which specifically binds the designated mRNA within cells in a way which inhibits translation
thereof.

The prior art teaches a number of delivery strategies which can be used to efficiently
deliver oligonucleotides into a wide variety of cell types [see, for example, Luft J Mol Med 76:
75-6 (1998); Kronenwett et al. Blood 91: 852-62 (1998); Rajur et al. Bioconjug Chem 8: 935-40
(1997); Lavigne et al. Biochem Biophys Res Commun 237: 566-71 (1997) and Aoki et al.
Biochem Biophys Res Commun 231: 540-5 (1997)].

In addition, algorithms for identifying those sequences with the highest predicted binding 
affinity for their target mRNA based on a thermodynamic cycle that accounts for the energetics of 
structural alterations in both the target mRNA and the oligonucleotide are also available [see, for 
example, Walton et al. Biotechnol Bioeng 65: 1-9 (1999)]. 
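
As a minimal sketch of how such a thermodynamic cycle may be used to rank candidate target sites, the following Python fragment sums the free energy of duplex formation with the energetic cost of opening structure in the target mRNA and in the oligonucleotide. It is not the algorithm of Walton et al.; the names `Candidate`, `net_binding_dg` and `rank`, and all of the energy values, are hypothetical and serve for illustration only. In practice the component energies would be obtained from an RNA folding program.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    duplex_dg: float         # kcal/mol, oligo:mRNA duplex formation (negative = favorable)
    target_unfold_dg: float  # kcal/mol, cost of opening local mRNA structure (positive)
    oligo_unfold_dg: float   # kcal/mol, cost of opening oligo self-structure (positive)


def net_binding_dg(c: Candidate) -> float:
    """Net free energy of binding: duplex gain offset by both unfolding costs."""
    return c.duplex_dg + c.target_unfold_dg + c.oligo_unfold_dg


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates from most to least favorable predicted binding."""
    return sorted(candidates, key=net_binding_dg)


# Illustrative, made-up energies; not measured or published values.
candidates = [
    Candidate("site_A", -20.0, 8.0, 1.0),
    Candidate("site_B", -18.0, 2.0, 0.5),
    Candidate("site_C", -22.0, 14.0, 3.0),
]

for c in rank(candidates):
    print(c.name, net_binding_dg(c))
```

In this made-up example the site with the strongest duplex (site_C) ranks last, because most of its binding energy is spent opening structure in the target.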

Such algorithms have been successfully used to implement an antisense approach in cells. 
For example, the algorithm developed by Walton et al. enabled scientists to successfully design
antisense oligonucleotides for rabbit beta-globin (RBG) and mouse tumor necrosis factor-alpha
(TNF-alpha) transcripts. The same research group has more recently reported that the antisense
activity of rationally selected oligonucleotides against three model target mRNAs (human lactate
dehydrogenase A and B and rat gp130) in cell culture, as evaluated by a kinetic PCR technique,
proved effective in almost all cases, including tests against three different targets in two cell types 
with phosphodiester and phosphorothioate oligonucleotide chemistries. 

In addition, several approaches for designing and predicting the efficiency of specific
oligonucleotides using an in vitro system have also been published [Matveeva et al., Nature
Biotechnology 16:1374-1375 (1998)].



Several clinical trials have demonstrated safety, feasibility and activity of antisense
oligonucleotides. For example, antisense oligonucleotides suitable for the treatment of cancer have
been successfully used [Holmlund et al., Curr Opin Mol Ther 1:372-85 (1999)], while treatment of
hematological malignancies via antisense oligonucleotides targeting the c-myb gene, p53 and Bcl-2
has entered clinical trials and has been shown to be tolerated by patients [Gewirtz, Curr Opin Mol
Ther 1:297-306 (1999)].

More recently, antisense-mediated suppression of human heparanase gene expression has 
been reported to inhibit pleural dissemination of human cancer cells in a mouse model [Uno et al., 
Cancer Res 61:7855-60 (2001)]. 

Thus, the current consensus is that recent developments in the field of antisense
technology which, as described above, have led to the generation of highly accurate antisense
design algorithms and a wide variety of oligonucleotide delivery systems, enable an ordinarily
skilled artisan to design and implement antisense approaches suitable for downregulating
expression of known sequences without having to resort to undue trial and error experimentation.

Target sites for antisense molecules are selected from the unique nucleotide sequences of
each of the polynucleotides of the present invention, such that each polynucleotide is specifically
downregulated.
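
Purely as an illustrative sketch of such a selection, the Python fragment below lists the windows of a variant transcript that do not occur in any other supplied transcript, and which are therefore candidate variant-specific target sites. The function name `unique_windows` and both toy sequences are hypothetical; a practical implementation would also screen candidates against the remainder of the transcriptome.

```python
from typing import List, Tuple


def unique_windows(variant: str, others: List[str], k: int) -> List[Tuple[int, str]]:
    """Return (offset, window) for each k-mer of `variant` absent from every sequence in `others`."""
    hits = []
    for i in range(len(variant) - k + 1):
        window = variant[i:i + k]
        # Keep the window only if no other transcript contains it.
        if not any(window in other for other in others):
            hits.append((i, window))
    return hits


# Toy sequences for illustration only: the variant differs from the
# known transcript by a short substitution in the middle.
known = "ACGTACGGATCC"
variant = "ACGTACTTGATCC"

sites = unique_windows(variant, [known], k=6)
for offset, window in sites:
    print(offset, window)
```

Only windows overlapping the region that differs from the known transcript are returned, so an oligonucleotide directed at one of them would not be expected to hybridize to the known transcript.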

Ribozymes - Another agent capable of downregulating expression of the polypeptides of
the present invention is a ribozyme molecule capable of specifically cleaving an mRNA transcript
encoding the polypeptide variants of the present invention. Ribozymes are being increasingly used
for the sequence-specific inhibition of gene expression by the cleavage of mRNAs encoding
proteins of interest [Welch et al., Curr Opin Biotechnol. 9:486-96 (1998)]. The possibility of
designing ribozymes to cleave any specific target RNA has rendered them valuable tools in both
basic research and therapeutic applications. In the therapeutics area, ribozymes have been exploited
to target viral RNAs in infectious diseases, dominant oncogenes in cancers and specific somatic
mutations in genetic disorders [Welch et al., Clin Diagn Virol. 10:163-71 (1998)]. Most notably,
several ribozyme gene therapy protocols for HIV patients are already in Phase 1 trials. More
recently, ribozymes have been used for transgenic animal research, gene target validation and
pathway elucidation. Several ribozymes are in various stages of clinical trials. ANGIOZYME was
the first chemically synthesized ribozyme to be studied in human clinical trials. ANGIOZYME
specifically inhibits formation of the VEGF-r (Vascular Endothelial Growth Factor receptor), a key
component in the angiogenesis pathway. Ribozyme Pharmaceuticals, Inc., as well as other firms,
have demonstrated the importance of anti-angiogenesis therapeutics in animal models.
HEPTAZYME, a ribozyme designed to selectively destroy Hepatitis C Virus (HCV) RNA, was
found effective in decreasing Hepatitis C viral RNA in cell culture assays (Ribozyme
Pharmaceuticals, Incorporated - WEB home page).

Alternatively, downregulation of the polypeptide variants of the present invention may be
achieved at the polypeptide level using downregulating agents such as antibodies or antibody
fragments capable of specifically binding the polypeptides of the present invention and inhibiting
the activity thereof (i.e., neutralizing antibodies). Such antibodies can be directed, for example, to
the heterodimerizing domain on the variant, or to a putative ligand binding domain. Further
description of antibodies and methods of generating same is provided below.

PHARMACEUTICAL COMPOSITIONS AND DELIVERY THEREOF 
The present invention features a pharmaceutical composition comprising a therapeutically 
effective amount of a therapeutic agent according to the present invention, which is preferably a
therapeutic protein variant as described herein. Optionally and alternatively, the therapeutic agent 
could be an antibody or an oligonucleotide that specifically recognizes and binds to the therapeutic 
protein variant, but not to the corresponding full length known protein. 

Alternatively, the pharmaceutical composition of the present invention includes a 
therapeutically effective amount of at least an active portion of a therapeutic protein variant
polypeptide. 

The pharmaceutical composition according to the present invention is preferably used for 
the treatment of cluster-related diseases. 

"Treatment" refers to both therapeutic treatment and prophylactic or preventative measures. 
Those in need of treatment include those already with the disorder as well as those in which the
disorder is to be prevented. Hence, the mammal to be treated herein may have been diagnosed as
having the disorder or may be predisposed or susceptible to the disorder. "Mammal" for purposes
of treatment refers to any animal classified as a mammal, including humans, domestic and farm
animals, and zoo, sports, or pet animals, such as dogs, horses, cats, cows, etc. Preferably, the
mammal is human.

A "disorder" is any condition that would benefit from treatment with the agent according to
the present invention. This includes chronic and acute disorders or diseases including those
pathological conditions which predispose the mammal to the disorder in question. Non-limiting
examples of disorders to be treated herein are described with regard to specific examples given
herein.

The term "therapeutically effective amount" refers to an amount of agent according to the
present invention that is effective to treat a disease or disorder in a mammal. In the case of cancer,
the therapeutically effective amount of the agent may reduce the number of cancer cells; reduce the
tumor size; inhibit (i.e., slow to some extent and preferably stop) cancer cell infiltration into
peripheral organs; inhibit (i.e., slow to some extent and preferably stop) tumor metastasis; inhibit,
to some extent, tumor growth; and/or relieve to some extent one or more of the symptoms
associated with the cancer. To the extent the agent may prevent growth and/or kill existing cancer
cells, it may be cytostatic and/or cytotoxic. For cancer therapy, efficacy can, for example, be
measured by assessing the time to disease progression (TTP) and/or determining the response rate
(RR).

The therapeutic agents of the present invention can be provided to the subject per se, or as 
part of a pharmaceutical composition where they are mixed with a pharmaceutically acceptable 
carrier. 

As used herein a "pharmaceutical composition" refers to a preparation of one or more of the 
active ingredients described herein with other chemical components such as physiologically
suitable carriers and excipients. The purpose of a pharmaceutical composition is to facilitate 
administration of a compound to an organism. 

Herein the term "active ingredient" refers to the preparation accountable for the biological
effect.

Hereinafter, the phrases "physiologically acceptable carrier" and "pharmaceutically
acceptable carrier", which may be interchangeably used, refer to a carrier or a diluent that does not
cause significant irritation to an organism and does not abrogate the biological activity and
properties of the administered compound. An adjuvant is included under these phrases. One of the
ingredients included in the pharmaceutically acceptable carrier can be, for example, polyethylene
glycol (PEG), a biocompatible polymer with a wide range of solubility in both organic and aqueous
media (Mutter et al. (1979)).

Herein the term "excipient" refers to an inert substance added to a pharmaceutical
composition to further facilitate administration of an active ingredient. Examples, without
limitation, of excipients include calcium carbonate, calcium phosphate, various sugars and types of
starch, cellulose derivatives, gelatin, vegetable oils and polyethylene glycols.

Techniques for formulation and administration of drugs may be found in "Remington's
Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest edition, which is incorporated
herein by reference.

Suitable routes of administration may, for example, include oral, rectal, transmucosal,
especially transnasal, intestinal or parenteral delivery, including intramuscular, subcutaneous and
intramedullary injections as well as intrathecal, direct intraventricular, intravenous, intraperitoneal,
intranasal, or intraocular injections. Alternatively, one may administer a preparation in a local rather
than systemic manner, for example, via injection of the preparation directly into a specific region of
a patient's body.

Pharmaceutical compositions of the present invention may be manufactured by processes
well known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes.

Pharmaceutical compositions for use in accordance with the present invention may be
formulated in conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries, which facilitate processing of the active ingredients into
preparations which can be used pharmaceutically. Proper formulation is dependent upon the route
of administration chosen.

For injection, the active ingredients of the invention may be formulated in aqueous 
solutions, preferably in physiologically compatible buffers such as Hank's solution, Ringer's 
solution, or physiological salt buffer. For transmucosal administration, penetrants appropriate to 
the barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art. 

For oral administration, the compounds can be formulated readily by combining the active
compounds with pharmaceutically acceptable carriers well known in the art. Such carriers enable
the compounds of the invention to be formulated as tablets, pills, dragees, capsules, liquids, gels,
syrups, slurries, suspensions, and the like, for oral ingestion by a patient. Pharmacological
preparations for oral use can be made using a solid excipient, optionally grinding the resulting
mixture, and processing the mixture of granules, after adding suitable auxiliaries if desired, to
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch,
wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose; and/or physiologically acceptable
polymers such as polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such
as cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar 
solutions may be used which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, 
carbopol gel, polyethylene glycol, titanium dioxide, lacquer solutions and suitable organic solvents 
or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for 
identification or to characterize different combinations of active compound doses. 

Pharmaceutical compositions which can be used orally include push-fit capsules made of
gelatin as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules may contain the active ingredients in admixture with filler such as
lactose, binders such as starches, lubricants such as talc or magnesium stearate and, optionally, 
stabilizers. In soft capsules, the active ingredients may be dissolved or suspended in suitable 
liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers 
may be added. All formulations for oral administration should be in dosages suitable for the 
chosen route of administration. 

For buccal administration, the compositions may take the form of tablets or lozenges 
formulated in conventional manner. 

For administration by nasal inhalation, the active ingredients for use according to the 
present invention are conveniently delivered in the form of an aerosol spray presentation from a 
pressurized pack or a nebulizer with the use of a suitable propellant, e.g., dichlorodifluoromethane, 
trichlorofluoromethane, dichloro-tetrafluoroethane or carbon dioxide. In the case of a pressurized 
aerosol, the dosage unit may be determined by providing a valve to deliver a metered amount. 
Capsules and cartridges of, e.g., gelatin for use in a dispenser may be formulated containing a 
powder mix of the compound and a suitable powder base such as lactose or starch. 

The preparations described herein may be formulated for parenteral administration, e.g., by 
bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage 
form, e.g., in ampoules or in multidose containers with, optionally, an added preservative. The
compositions may be suspensions, solutions or emulsions in oily or aqueous vehicles, and may 
contain formulatory agents such as suspending, stabilizing and/or dispersing agents. 

Pharmaceutical compositions for parenteral administration include aqueous solutions of the
active preparation in water-soluble form. Additionally, suspensions of the active ingredients may
be prepared as appropriate oily or water-based injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters such as ethyl oleate,
triglycerides or liposomes. Aqueous injection suspensions may contain substances which increase
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol or dextran.
Optionally, the suspension may also contain suitable stabilizers or agents which increase the
solubility of the active ingredients to allow for the preparation of highly concentrated solutions.

Alternatively, the active ingredient may be in powder form for constitution with a suitable 
vehicle, e.g., sterile, pyrogen-free, water-based solution, before use.

The preparation of the present invention may also be formulated in rectal compositions 
such as suppositories or retention enemas, using, e.g., conventional suppository bases such as cocoa
butter or other glycerides. 

Pharmaceutical compositions suitable for use in the context of the present invention include
compositions wherein the active ingredients are contained in an amount effective to achieve the
intended purpose. More specifically, a therapeutically effective amount means an amount of active
ingredients effective to prevent, alleviate or ameliorate symptoms of disease or prolong the survival
of the subject being treated.

Determination of a therapeutically effective amount is well within the capability of those
skilled in the art.

For any preparation used in the methods of the invention, the therapeutically effective
amount or dose can be estimated initially from in vitro assays. For example, a dose can be
formulated in animal models and such information can be used to more accurately determine useful
doses in humans.

Toxicity and therapeutic efficacy of the active ingredients described herein can be
determined by standard pharmaceutical procedures in vitro, in cell cultures or experimental
animals. The data obtained from these in vitro and cell culture assays and animal studies can be
used in formulating a range of dosage for use in humans. The dosage may vary depending upon the
dosage form employed and the route of administration utilized. The exact formulation, route of
administration and dosage can be chosen by the individual physician in view of the patient's
condition. (See e.g., Fingl, et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1 p. 1).

Depending on the severity and responsiveness of the condition to be treated, dosing can be
of a single or a plurality of administrations, with course of treatment lasting from several days to
several weeks or until cure is effected or diminution of the disease state is achieved.

The amount of a composition to be administered will, of course, be dependent on the 
subject being treated, the severity of the affliction, the manner of administration, the judgment of 
the prescribing physician, etc. 

Compositions including the preparation of the present invention formulated in a compatible
pharmaceutical carrier may also be prepared, placed in an appropriate container, and labeled for
treatment of an indicated condition.

Pharmaceutical compositions of the present invention may, if desired, be presented in a
pack or dispenser device, such as an FDA approved kit, which may contain one or more unit
dosage forms containing the active ingredient. The pack may, for example, comprise metal or
plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by
instructions for administration. The pack or dispenser may also be accompanied by a notice
associated with the container in a form prescribed by a governmental agency regulating the
manufacture, use or sale of pharmaceuticals, which notice is reflective of approval by the agency of
the form of the compositions for human or veterinary administration. Such notice, for example,
may be of labeling approved by the U.S. Food and Drug Administration for prescription drugs or of
an approved product insert.

IMMUNOGENIC COMPOSITIONS 

A therapeutic agent according to the present invention may optionally be a molecule which
promotes a specific immunogenic response against at least one of the polypeptides of the present
invention in the subject. The molecule can be polypeptide variants of the present invention, a
fragment derived therefrom or a nucleic acid sequence encoding thereof. Although such a
molecule can be provided to the subject per se, the agent is preferably administered with an
immunostimulant in an immunogenic composition. An immunostimulant may be any substance
that enhances or potentiates an immune response (antibody and/or cell-mediated) to an exogenous
antigen. Examples of immunostimulants include adjuvants, biodegradable microspheres (e.g.,
polylactic galactide) and liposomes into which the compound is incorporated (see e.g., U.S. Pat.
No. 4,235,877). Vaccine preparation is generally described in, for example, M. F. Powell and M. J.
Newman, eds., "Vaccine Design (The subunit and adjuvant approach)," Plenum Press (NY, 1995).

Illustrative immunogenic compositions may contain DNA encoding one or more of the 
polypeptides as described above, such that the polypeptide is generated in situ. The DNA may be 
present within any of a variety of delivery systems known to those of ordinary skill in the art, 
including nucleic acid expression systems (see below), bacteria and viral expression systems. 
Numerous gene delivery techniques are well known in the art, such as those described by Rolland, 
Crit. Rev. Therap. Drug Carrier Systems 15:143-198, 1998, and references cited therein. 
Appropriate nucleic acid expression systems contain the necessary DNA sequences for expression 
in the subject (such as a suitable promoter and terminating signal). Bacterial delivery systems 
involve the administration of a bacterium (such as Bacillus Calmette-Guerin) that expresses an
immunogenic portion of the polypeptide on its cell surface or secretes such an epitope. In a
preferred embodiment, the DNA may be introduced using a viral expression system (e.g., vaccinia
or other pox virus, retrovirus, or adenovirus), which may involve the use of a non-pathogenic
(defective), replication-competent virus. Suitable systems are disclosed, for example, in Fisher-Hoch
et al., Proc. Natl. Acad. Sci. USA 86:317-321, 1989; Flexner et al., Ann. N.Y. Acad. Sci.
569:86-103, 1989; Flexner et al., Vaccine 8:17-21, 1990; U.S. Pat. Nos. 4,603,112, 4,769,330, and
5,017,487; WO 89/01973; U.S. Pat. No. 4,777,127; GB 2,200,651; EP 0,345,242; WO 91/02805;
Berkner, Biotechniques 6:616-627, 1988; Rosenfeld et al., Science 252:431-434, 1991; Kolls et al.,
Proc. Natl. Acad. Sci. USA 91:215-219, 1994; Kass-Eisler et al., Proc. Natl. Acad. Sci. USA
90:11498-11502, 1993; Guzman et al., Circulation 88:2838-2848, 1993; and Guzman et al., Cir.
Res. 73:1202-1207, 1993. Techniques for incorporating DNA into such expression systems are
well known to those of ordinary skill in the art. The DNA may also be "naked," as described, for
example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by Cohen, Science
259:1691-1692, 1993. The uptake of naked DNA may be increased by coating the DNA onto biodegradable
beads, which are efficiently transported into the cells. 

It will be appreciated that an immunogenic composition may comprise both a
polynucleotide and a polypeptide component. Such immunogenic compositions may provide for
an enhanced immune response.

Any of a variety of immunostimulants may be employed in the immunogenic compositions
of this invention. For example, an adjuvant may be included. Most adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or mineral oil,
and a stimulator of immune responses, such as lipid A, Bordetella pertussis or Mycobacterium
tuberculosis derived proteins. Suitable adjuvants are commercially available as, for example,
Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, Mich.); Merck
Adjuvant 65 (Merck and Company, Inc., Rahway, N.J.); AS-2 (SmithKline Beecham, Philadelphia,
Pa.); aluminum salts such as aluminum hydroxide gel (alum) or aluminum phosphate; salts of
calcium, iron or zinc; an insoluble suspension of acylated tyrosine; acylated sugars; cationically or
anionically derivatized polysaccharides; polyphosphazenes; biodegradable microspheres;
monophosphoryl lipid A and quil A. Cytokines, such as GM-CSF or interleukin-2, -7 or -12, may
also be used as adjuvants.

The adjuvant composition may be designed to induce an immune response predominantly
of the Th1 type. High levels of Th1-type cytokines (e.g., IFN-γ, TNF-α, IL-2 and IL-12)
tend to favor the induction of cell mediated immune responses to an administered antigen. In
contrast, high levels of Th2-type cytokines (e.g., IL-4, IL-5, IL-6 and IL-10) tend to favor the
induction of humoral immune responses. Following application of an immunogenic composition as
provided herein, the subject will support an immune response that includes Th1- and Th2-type
responses. The levels of these cytokines may be readily assessed using standard assays. For a
review of the families of cytokines, see Mosmann and Coffman, Ann. Rev. Immunol. 7:145-173,
1989.

Preferred adjuvants for use in eliciting a predominantly Th1-type response include, for
example, a combination of monophosphoryl lipid A, preferably 3-de-O-acylated monophosphoryl
lipid A (3D-MPL), together with an aluminum salt. MPL adjuvants are available from Corixa
Corporation (Seattle, Wash.; see U.S. Pat. Nos. 4,436,727; 4,877,611; 4,866,034 and 4,912,094).
CpG-containing oligonucleotides (in which the CpG dinucleotide is unmethylated) also induce a
predominantly Th1 response. Such oligonucleotides are well known and are described, for example,
in WO 96/02555, WO 99/33488 and U.S. Pat. Nos. 6,008,200 and 5,856,462. Immunostimulatory
DNA sequences are also described, for example, by Sato et al., Science 273:352, 1996. Another
preferred adjuvant is a saponin, preferably QS21 (Aquila Biopharmaceuticals Inc., Framingham,
Mass.), which may be used alone or in combination with other adjuvants. For example, an
enhanced system involves the combination of a monophosphoryl lipid A and saponin derivative,
such as the combination of QS21 and 3D-MPL as described in WO 94/00153, or a less reactogenic
composition where the QS21 is quenched with cholesterol, as described in WO 96/33739. Other
preferred formulations comprise an oil-in-water emulsion and tocopherol. A particularly potent
adjuvant formulation involving QS21, 3D-MPL and tocopherol in an oil-in-water emulsion is
described in WO 95/17210.

Other preferred adjuvants include Montanide ISA 720 (Seppic, France), SAF (Chiron,
Calif., United States), ISCOMS (CSL), MF-59 (Chiron), the SBAS series of adjuvants (e.g.,
SBAS-2 or SBAS-4, available from SmithKline Beecham, Rixensart, Belgium), Detox (Corixa,
Hamilton, Mont.), RC-529 (Corixa, Hamilton, Mont.) and other aminoalkyl glucosaminide
4-phosphates (AGPs), such as those described in pending U.S. patent application Ser. Nos.
08/853,826 and 09/074,720.

A delivery vehicle may be employed within the immunogenic composition of the present
invention to facilitate production of an antigen-specific immune response that targets tumor cells.
Delivery vehicles include antigen presenting cells (APCs), such as dendritic cells, macrophages, B
cells, monocytes and other cells that may be engineered to be efficient APCs. Such cells may be
genetically modified to increase the capacity for presenting the antigen, to improve activation
and/or maintenance of the T cell response, to have anti-tumor effects per se and/or to be
immunologically compatible with the receiver (i.e., matched HLA haplotype). APCs may
generally be isolated from any of a variety of biological fluids and organs, including tumor and
peritumoral tissues, and may be autologous, allogeneic, syngeneic or xenogeneic cells.

Dendritic cells are highly potent APCs (Banchereau and Steinman, Nature 392:245-251,
1998) and have been shown to be effective as a physiological adjuvant for eliciting prophylactic or
therapeutic antitumor immunity (see Timmerman and Levy, Ann. Rev. Med. 50:507-529, 1999). In
general, dendritic cells may be identified based on their typical shape (stellate in situ, with marked
cytoplasmic processes (dendrites) visible in vitro), their ability to take up, process and present
antigens with high efficiency and their ability to activate naive T cell responses. Dendritic cells
may, of course, be engineered to express specific cell-surface receptors or ligands that are not
commonly found on dendritic cells in vivo or ex vivo, and such modified dendritic cells are
contemplated by the present invention. As an alternative to dendritic cells, vesicles secreted by
antigen-loaded dendritic cells (called exosomes) may be used within an immunogenic composition
(see Zitvogel et al., Nature Med. 4:594-600, 1998).

Dendritic cells and progenitors may be obtained from peripheral blood, bone marrow,
tumor-infiltrating cells, peritumoral tissues-infiltrating cells, lymph nodes, spleen, skin, umbilical
cord blood or any other suitable tissue or fluid. For example, dendritic cells may be differentiated
ex vivo by adding a combination of cytokines such as GM-CSF, IL-4, IL-13 and/or TNF-α to
cultures of monocytes harvested from peripheral blood. Alternatively, CD34 positive cells
harvested from peripheral blood, umbilical cord blood or bone marrow may be differentiated into
dendritic cells by adding to the culture medium combinations of GM-CSF, IL-3, TNF-α, CD40
ligand, LPS, flt3 ligand and/or other compound(s) that induce differentiation, maturation and
proliferation of dendritic cells.

Dendritic cells are categorized as "immature" and "mature" cells, which allows a simple
way to discriminate between two well characterized phenotypes. Immature dendritic cells are
characterized as APCs with a high capacity for antigen uptake and processing, which correlates with
the high expression of Fcγ receptor and mannose receptor. The mature phenotype is typically
characterized by a lower expression of these markers, but a high expression of cell surface
molecules responsible for T cell activation such as class I and class II MHC, adhesion molecules
(e.g., CD54 and CD11) and costimulatory molecules (e.g., CD40, CD80, CD86 and 4-1BB).

APCs may generally be transfected with at least one polynucleotide encoding a polypeptide 
of the present invention, such that variant II, or an immunogenic portion thereof, is expressed on 
the cell surface. Such transfection may take place ex vivo, and a composition comprising such 
transfected cells may then be used for therapeutic purposes, as described herein. Alternatively, a 
gene delivery vehicle that targets a dendritic or other antigen presenting cell may be administered 
to the subject, resulting in transfection that occurs in vivo. In vivo and ex vivo transfection of 
dendritic cells, for example, may generally be performed using any methods known in the art, such 
as those described in WO 97/24447, or the gene gun approach described by Mahvi et al., 
Immunology and Cell Biology 75:456-460, 1997. Antigen loading of dendritic cells may be
achieved by incubating dendritic cells or progenitor cells with a polypeptide of the present invention,
DNA (naked or within a plasmid vector) or RNA; or with antigen-expressing recombinant
bacterium or viruses (e.g., vaccinia, fowlpox, adenovirus or lentivirus vectors). Prior to loading,
the polypeptide may be covalently conjugated to an immunological partner that provides T cell
help (e.g., a carrier molecule) such as described above. Alternatively, a dendritic cell may be pulsed
with a non-conjugated immunological partner, separately or in the presence of the polypeptide.



It is appreciated that certain features of the invention, which are, for clarity, described in
the context of separate embodiments, may also be provided in combination in a single
embodiment. Conversely, various features of the invention, which are, for brevity, described in
the context of a single embodiment, may also be provided separately or in any suitable
subcombination.

Although the invention has been described in conjunction with specific embodiments 
thereof, it is evident that many alternatives, modifications and variations will be apparent to 
those skilled in the art. Accordingly, it is intended to embrace all such alternatives, 
modifications and variations that fall within the spirit and broad scope of the appended claims. 

All publications, patents and patent applications mentioned in this specification are herein 
incorporated in their entirety by reference into the specification, to the same extent as if each 
individual publication, patent or patent application was specifically and individually indicated to 
be incorporated herein by reference. In addition, citation or identification of any reference in 
this application shall not be construed as an admission that such reference is available as prior 
art to the present invention. 




WO 2006/131783 



PCT/IB2005/004037 



WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a polynucleotide having a sequence of 
R11723_PEA_1_T5. 

2. The isolated polynucleotide of claim 1, comprising a node having a sequence of: 
R11723_PEA_1_node_13. 

3. An isolated polypeptide comprising a polypeptide having a sequence of: 
R11723_PEA_1_P13. 

4. The isolated polypeptide of claim 3, comprising a chimeric polypeptide encoding for 
R11723_PEA_1_P13, comprising a first amino acid sequence being at least 95% homologous 
to 
MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSA 
corresponding to amino acids 1 - 63 of Q96AC2, which also corresponds to amino 
acids 1-63 of R11723_PEA_1_P13, and a second amino acid sequence being at least about 
95% homologous to a polypeptide having the sequence DTKRTNTLLFEMRHFAKQLTT 
corresponding to amino acids 64 - 84 of R11723_PEA_1_P13, wherein said first and second 
amino acid sequences are contiguous and in a sequential order. 

4. The isolated polypeptide of claim 4, comprising a tail of R11723_PEA_1_P13, 
comprising a polypeptide being at least about 95% homologous to the sequence 
DTKRTNTLLFEMRHFAKQLTT in R11723_PEA_1_P13. 

5. The isolated oligonucleotide of claim 1, comprising an amplicon according to 
SEQ ID NO: 1684. 

6. A primer pair, comprising a pair of isolated oligonucleotides capable of 
amplifying said amplicon of claim 5. 

7. The primer pair of claim 6, comprising a pair of isolated oligonucleotides: SEQ ID 
NOs 1682 and 1683. 

8. An antibody capable of specifically binding to an epitope of an amino acid sequence of 
claim 3. 

9. The antibody of claim 8, wherein said amino acid sequence comprises said tail of claim 
4. 

10. The antibody of claim 8, wherein said antibody is capable of differentiating between 
a splice variant having said epitope and a corresponding known protein PSEC. 

11. A kit for detecting lung cancer, comprising a kit detecting overexpression of a 
splice variant according to claim 1. 

12. The kit of claim 11, wherein said kit comprises a NAT-based technology. 

13. The kit of claim 11, wherein said kit further comprises at least one primer pair 
capable of selectively hybridizing to a nucleic acid sequence according to claim 1. 

14. The kit of claim 11, wherein said kit further comprises at least one 
oligonucleotide capable of selectively hybridizing to a nucleic acid sequence according to claim 
1. 

12. A kit for detecting lung cancer, comprising a kit detecting overexpression of a 
splice variant according to claim 3, said kit comprising an antibody according to claim 8. 

13. The kit of claim 12, wherein said kit further comprises at least one reagent for 
performing an ELISA or a Western blot. 

14. A method for detecting lung cancer, comprising detecting overexpression of a 
splice variant according to claim 1. 

15. The method of claim 14, wherein said detecting overexpression is performed 
with a NAT-based technology. 

16. A method for detecting lung cancer, comprising detecting overexpression of a 
splice variant according to claim 3, wherein said detecting overexpression is performed with an 
immunoassay. 

17. The method of claim 16, wherein said immunoassay comprises an antibody 
according to claim 8. 

18. A biomarker capable of detecting lung cancer, comprising a nucleic acid 
sequence according to claim 1 or a fragment thereof, or an amino acid sequence according to 
claim 3 or a fragment thereof. 

19. A method for screening for lung cancer, comprising detecting lung cancer cells 
with a biomarker according to claim 18. 

20. A method for diagnosing lung cancer, comprising detecting lung cancer cells 
with a biomarker according to claim 18. 

21. A method for monitoring disease progression and/or treatment efficacy and/or 
relapse of lung cancer, comprising detecting lung cancer cells with a biomarker according to 
claim 18. 

22. A method of selecting a therapy for lung cancer, comprising detecting lung 
cancer cells with a biomarker according to claim 18 and selecting a therapy according to said 
detection. 
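
The chimeric polypeptide recited in claim 4 (a 63-residue head followed contiguously by a 21-residue tail) can be illustrated with a short sketch. The two sequences are copied from the claim; the function names, and the use of ungapped percent identity as a stand-in for "homologous", are simplifying assumptions made only for this illustration and are not the definition of homology used in the specification.

```python
# Sequences copied from claim 4. All names below are illustrative (hypothetical).
HEAD = "MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSA"  # amino acids 1-63
TAIL = "DTKRTNTLLFEMRHFAKQLTT"  # amino acids 64-84


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: identity is used here as a simple proxy for homology.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def matches_chimera(candidate: str, threshold: float = 95.0) -> bool:
    """Check that the head and tail portions each meet the threshold,
    with the tail immediately following the head (contiguous, in order)."""
    if len(candidate) != len(HEAD) + len(TAIL):
        return False
    head, tail = candidate[:len(HEAD)], candidate[len(HEAD):]
    return (percent_identity(head, HEAD) >= threshold
            and percent_identity(tail, TAIL) >= threshold)


chimera = HEAD + TAIL
print(len(HEAD), len(TAIL), len(chimera))
print(matches_chimera(chimera))
```

Under this simplification, a single substitution in the 21-residue tail still gives 20/21 identity (about 95.2%), whereas two substitutions fall below the 95% threshold.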
>gi|47124622|gb|AAH70449.1| Mapkbp1 protein [Mus musculus] 
Length = 1503 

Score = 911 bits (2354), Expect = 0.0 

Identities = 447/759 (58%), Positives = 576/759 (75%), Gaps = 11/759 (1%) 

Query: 40 APPICLRRRTRLSTASEETVQNRVSLEKVLGITAQNSSGLTCDPGTGHVAYLAGCVVVIL 99 

+P I LRR + E + ++V+LEKVLG+T GL CDP +G VAY AGCVVV+ 

Sbjct: 19 SPSIKLRRSK--AGNRREDLSSKVTLEKVLGVTVSGGRGLACDPRSGLVAYSAGCVVVLF 76 

Query: 100 DPKENKQQHIFNTARKSLSALAFSPDGKYIVTGENGHRPAVRIWDVEEKNQVAEMLGHKY 159 

+P+++KQ HI N++RK+++ALAFSPDGKY+VTGE+GH PAVR+WDV E++QVAE+ HKY 
Sbjct: 77 NPRKHKQHHILNSSRKTITALAFSPDGKYLVTGESGHMPAVRVWDVAERSQVAELQEHKY 136 

Query: 160 GVACVAFSPNMKHIVSMGYQHDMVLNVWDWKKDIVVASNKVSCRVIALSFSEDSSYFVTV 219 

GVACVAFSP+ K+IVS+GYQHDM++NVW WKK+IVVASNKVS RV A+SFSED SYFVT 
Sbjct: 137 GVACVAFSPSAKYIVSVGYQHDMIVNVWAWKKNIVVASNKVSSRVTAVSFSEDCSYFVTA 196 

Query: 220 GNRHVRFWFLXXXXXXXXXXXXPLVGRSGILGELHNNIFCGVACGRGRMAGSTFCVSYSG 279 

GNRH++FW+L PL+GRSG+LGEL NN+F VACGRG A STFC++ SG 

Sbjct: 197 GNRHIKFWYLDDSKTSKVNATVPLLGRSGLLGELRNNLFTDVACGRGEKADSTFCITSSG 256 

Query: 280 LLCQFNEKRVLEKWINLKXXXXXXXXXXQELIFCGCTDGIVRIFQAHSLHYLANLPKPHY 339 

LLC+F+++R+L+KW+ L+ QE IFCGC DG VR+F +LH+L+ LP+PH 

Sbjct: 257 LLCEFSDRRLLDKWVELRTTVAHCISVTQEYIFCGCADGTVRLFNPSNLHFLSTLPRPHA 316 

Query: 340 LGVDVAQGLEPSFLFHRKAEAVYPDTVALTFDPIHQWLSCVYKDHSIYIWDVKDINRVGK 399 

LG D+A ESLF A YPDT+ALTFDP +QWLSCVY DHSIY+WDV+D +VGK 

Sbjct: 317 LGTDIASITEASRLFSGGVNARYPDTIALTFDPTNQWLSCVYNDHSIYVWDVRDPKKVGK 376 

Query: 400 VWSELFHSSYVWNVEVYPEFED-QRACLPSGSFLTCSSDNTIRFWNLDSSP--DSHWQKN 456 

V+S L+HSS VW+VEVYPE +D +ACLP SF+TCSSDNTIR WN +SS S +N 
Sbjct: 377 VYSALYHSSCVWSVEVYPEIKDSHQACLPPSSFITCSSDNTIRLWNTESSGVHGSTLHRN 436 

Query: 457 IFSNTLLKVVYVENDIQHLQDMSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRS 516 

I SN L+K++YV+ + Q L D + P +G+ MD + G+R + +SP+GQHLASGDR 

Sbjct: 437 ILSNDLIKIIYVDGNTQALLD-TELPGGDKADGSLMDPRVGIRSVCISPNGQHLASGDRM 495 

Query: 517 GNLRIHELHFMDELVKVEAHDAEVLCLEYSKPETGLTLLASASRDRLIHVLNVEKNYNLE 576 

G LRIHEL + E++KVEAHD+E+LCLEYSKP+TGL LLAS ASRDRLI HVL+ + Y+L+ 
Sbjct: 496 GTLRIHELQSLSEMLKVEAHDSEILCLEYSKPDTGLKLLASASRDRLIHVLDAGREYSLQ 555 

Query: 577 QTLDDHSSSITAIKFAGNR-DIQMISCGADKSIYFRSAQQGSDGLHFVRTHHVAEKTTLY 635 

QTLD+HSSSITA+KFA + ++MISCGADKSIYFR+AQ+ +G+ F RTHHV KTTLY 
Sbjct: 556 QTLDEHSSSITAVKFAASDGQVRMISCGADKSIYFRTAQKSGEGVQFTRTHHVVRKTTLY 615 

Query: 636 DMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLATSC 695 

DMD++ + KY A+ CQDRN+R++N +GKQKK +KGSQG++G+L+KV DPSG ++ATSC 
Sbjct: 616 DMDVEPSWKYTAIGCQDRNIRIFNISSGKQKKLFKGSQGEDGTLIKVQTDPSGIYIATSC 675 

Query: 696 SDKSISVIDFYSGECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCM 755 

SDK++S+ DF SGEC+A MFGHSEI+T MKF4- DC HLI+VSGDSC+F+W L E+T M 
Sbjct: 676 SDKNLSIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLISVSGDSCIFVWRLSSEMTISM 735 

Query: 756 KQHLLEIDHRQ----QQQHTNDKKRSGHPRQDTYVSTPS 790 

+Q L E+ RQ QQ T+HSG + V PS 

Sbjct: 736 RQRLAELRQRQRGIKQQGPTSPQRASGAKQHHAPVVPPS 774 
>gi|34856717|ref|XP_342499.1| similar to JNK-binding protein JNKBP1 [Rattus norvegicus] 
Length = 1530 

Score = 910 bits (2353), Expect = 0.0 

Identities = 467/868 (53%), Positives = 611/868 (70%), Gaps = 29/868 (3%) 

Query: 40 APPICLRRRTRLSTASEETVQNRVSLEKVLGITAQNSSGLTCDPGTGHVAYLAGCVVVIL 99 

+P I LRR + E + + + V+ IiEK VLG-HT GL CDP +G VAY AGCVW+ 

Sbjct: 18 SPSIKLRRSK--AGNRREDLSSKVTLEKVLGVTVSGGRGLACDPRSGLVAYPAGCVVVLF 75 

Query: 100 DPKENKQQHIFNTARKSLSALAFSPDGKYIVTGENGHRPAVRIWDVEEKNQVAEMLGHKY 159 

+P+++KQ HI N++RK+++ALAFSPDGKY+VTGE+GH PAVR+WDV E+NQVAE+ HKY 
Sbjct: 76 NPRKHKQHHILNSSRKTITALAFSPDGKYLVTGESGHMPAVRVWDVAERNQVAELQEHKY 135 

Query: 160 GVACVAFSPNMKHIVSMGYQHDMVLNVWDWKKDIVVASNKVSCRVIALSFSEDSSYFVTV 219 

GVACVAFSP+ K+IVS+GYQHDM++NVW WKK+IVVASNKVS RV A+SFSED SYFVT 
Sbjct: 136 GVACVAFSPSAKYIVSVGYQHDMIVNVWAWKKNIVVASNKVSSRVTAVSFSEDCSYFVTA 195 

Query: 220 GNRHVRFWFLXXXXXXXXXXXXPLVGRSGILGELHNNIFCGVACGRGRMAGSTFCVSYSG 279 

GNRH++FW+L PL+GRSG+LGEL NN+F VACGRG+ A STFC++ SG 

Sbjct: 196 GNRHIKFWYLDDSKTSKVNATVPLLGRSGLLGELRNNLFTDVACGRGKKADSTFCITSSG 255 

Query: 280 LLCQFNEKRVLEKWINLK------XXXXXXXXXXQELIFCGCTDGIVRIFQAHSLHYLAN 333 

LLC+F4-++R+L+KW+ L+ QE IFCGC DG VR+F +LH+L+ 

Sbjct: 256 LLCEFSDRRLLDKWVELRNTDSFTTTVAHCISVSQEYIFCGCADGTVRLFNPSNLHFLST 315 

Query: 334 LPKPHYLGVDVAQGLEPSFLFHRKAEAVYPDTVALTFDPIHQWLSCVYKDHSIYIWDVKD 393 

LP+PH LG D+A E S LF A A YPDT+ALTFDP +QWLSCVY DHSIY+WDV+D 
Sbjct: 316 LPRPHALGTDIATITEASRLFSGGANARYPDTIALTFDPANQWLSCVYNDHSIYVWDVRD 375 

Query: 394 INRVGKVWSELFHSSYVWNVEVYPEFED-QRACLPSGSFLTCSSDNTIRFWNLDSSP--D 450 

+VGKV+S L+HSS VW+VEVYPE +D +ACLP SF+TCSSDNTIR WN +SS 
Sbjct: 376 PKKVGKVYSALYHSSCVWSVEVYPEIKDSNQACLPPSSFITCSSDNTIRLWNTESSGVHG 435 

Query: 451 SHWQKNIFSNTLLKVVYVENDIQHLQDMSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHL 510 

S +NI SN L+K++YV+ + Q L D + P +G+ MD + G+R + +SP+GQHL 

Sbjct: 436 SALHRNILSNDLIKIIYVDGNTQALLD-TELPGGDKADGSLMDPRVGIRSVCISPNGQHL 494 

Query: 511 ASGDRSGNLRIHELHFMDELVKVEAHDAEVLCLEYSKPETGLTLLASASRDRLIHVLNVE 570 

ASGDR G LR+HEL + EL4-KVEAHD+E+LCLEYSKP+TGL LLASASRDRLIHVL+ 
Sbjct: 495 ASGDRMGTLRVHELQSLSELLKVEAHDSEILCLEYSKPDTGLKLLASASRDRLIHVLDAG 554 

Query: 571 KNYNLEQTLDDHSSSITAIKFAGNR-DIQMISCGADKSIYFRSAQQGSDGLHFVRTHHVA 629 

+ Y+L+QTLD+HSSSITA+KFA + + +MI SCGADKS I YFR+AQ+ +G+ F RTHHV 
Sbjct: 555 REYSLQQTLDEHSSSITAVKFAASDGQVRMISCGADKSIYFRTAQKSGEGVQFTRTHHVV 614 

Query: 630 EKTTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGT 689 

KTTLYDMD++ + KY A+ CQDRN+R++N +GKQKK +KGSQG++G+L+KV DPSG 
Sbjct: 615 RKTTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQKKLFKGSQGEDGTLIKVQTDPSGI 674 

Query: 690 FLATSCSDKSISVIDFYSGECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGP 749 

++ATSCSDK++S+ DF+SGEC+A MFGHSEI+T MKF+ DC HLI+VSGDSC+F+W h 
Sbjct: 675 YIATSCSDKNLSIFDFFSGECVATMFGHSEIVTGMKFSNDCKHLISVSGDSCIFVWRLSS 734 

Query: 750 EITNCMKQHLLEIDHRQ----QQQHTNDKKRSGHPRQDTYVSTPSEIHSLSPGXXXXXXX 805 

E+T M+Q L E+ RQ QQ T+ +K SG + V PS P 
Sbjct: 735 EMTISMRQRLAELRQRQRGIKQQGPTSPQKASGAKQHHPPVVPPS-----GPALSSDSDK 789 

Query: 806 XXXXXXXXMLKTPSKDSLDPDPRCLLTNGKLPL WAKRLLGDDDVADGSAFHAK 858 

+ P+L + L+GP W ++ G A A 

Sbjct: 790 EGEDEGTEEEELPALPILGKSTKKELASGSSPALLRSLSHWEMSRAQENMEFLGPAPTAN 849 

Query: 859 RSYQPHGRWAERAGQEPLKTILDAQDLD 886 
+ GRWA+ + +-M-+LD + L+ 

Sbjct: 850 TGPKRRGRWAQPGVELSVRSMLDLRQLE 877 
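
The percentages in the two alignment summaries above can be recomputed from the reported counts. The sketch below is illustrative only: the function and dictionary names are hypothetical, and the use of integer truncation is an inference from the reported figures (447/759 is about 58.9% but is shown as 58%), not a statement about how the alignment program is implemented.

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Percentage truncated to an integer.

    Assumption: truncation (not rounding) reproduces the figures shown
    in the alignment summaries above.
    """
    return count * 100 // alignment_length


# Counts copied from the two alignment summaries (identities, positives, gaps).
summaries = {
    "Mus musculus Mapkbp1": {
        "identities": (447, 759), "positives": (576, 759), "gaps": (11, 759)},
    "Rattus norvegicus JNKBP1-like": {
        "identities": (467, 868), "positives": (611, 868), "gaps": (29, 868)},
}

for name, stats in summaries.items():
    percents = {key: blast_percent(*value) for key, value in stats.items()}
    print(name, percents)
```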
73-Am-skeletal muscle 
72-CG-brain 
71-B-brain 
70-CI-brain 
69-B-brain 
68-CG-cerebellum 
67-CG-cerebellum 
66-B-kidney 
65-CI-kidney 
64-Am-kidney 
63-CI-salivary gland 
62-CG-thyroid 
61-B-thyroid 
60-B-thyroid 
59-B-thymus 
58-Am-thymus 
57-CG-thymus 
56-B-spleen 
55-Am-spleen 
54-CG-spleen 
53-CGEH-blood 
52-CGEH-blood 
51-CGEH-blood 
50-CI-BM 
49-CG-liver*** 
48-CG-liver 
47-Am-liver 
46-CG-heart 
45-CG-heart fib 
44-B-heart 
43-B-adrenal 
42-CG-adrenal 
41-CI-testis 
40-B-testis 
39-Am-testis 
38-Am-prostate (P59) 
37-Am-prostate (P42) 
36-CI-prostate (P53) 
35-Am-breast (B64) 
34-Am-breast (B63) 
33-B-breast (B59) 
32-B-placenta 
31-B-placenta 
30-Am-placenta 
29-B-bladder 
28-Am-bladder 
27-B-bladder 
26-B-uterus 
25-B-uterus 
24-B-uterus 
23-B-cervix 
22-B-cervix*** 
21-Am-cervix 
20-B-ovary (O46) 
19-B-ovary (O48) 
18-Am-ovary (O47) 
17-B-lung (L92) 
16-Am-lung (L93) 
15-B-lung 
14-CG-pancreas 
13-Am-pancreas 
12-B-esophagus 
11-B-esophagus 
10-B-stomach 
9-Am-stomach 
8-B-rectum 
7-B-rectum 
6-B-rectum 
5-B-small intestine 
4-Am-small intestine 
3-CI-colon (C70) 
2-B-colon (C69) 
1-Am-colon (C71) 
[Figure residue: sample-axis labels of a second normal tissue panel plot, running from 74-CI-skeletal muscle down to 1-Am-colon (C71); samples 73 to 1 repeat the labels of the preceding panel.] 
99-Am-N 
98-Am-N 
97-Am-N 
96-Am-N 
93-Am-N 
92-B-N 
91-B-N 
90-B-N 
50-B-N 
49-B-N 
48-B-N 
47-B-N 
46-B-N M44 
85-B-Small cell carci 
84-B-Small cell carci 
83-B-Small cell carci 
86-B-Small cell carci G3 
33-B-Small cell carci G3 
32-B-Small cell carci G3 
31-B-Small cell carci G3 
30-B-Small cell carci G3 
82-B-Large cell 
39-B-Large cell 
38-B-Large cell 
87-B-Large cell G3 
25-CG-Squamous 
24-CG-Squamous 
23-CG-Squamous 
100-B-Squamous 
88-B-Squamous 
22-B-Squamous 
20-B-Squamous 
79-B-Squamous G3 
81-B-Squamous G3 
18-B-Squamous G2-3 
80-B-Squamous G2 
78-B-Squamous G2 
21-B-Squamous G2 
17-B-Squamous G2 
16-B-Squamous G2 
19-B-Squamous G1 
44-B-Alvelous Adeno G2 
45-B-Alvelous Adeno 
15-CG-Bronch adeno 
14-CG-Adeno 
3-CG-Adeno 
94-B-Adeno G3 
76-B-Adeno G3 
89-B-Adeno G2-3 
13-B-Adeno G2-3 
77-B-Adeno G2 
75-B-Adeno G2 
12-B-Adeno G2 
95-B-Adeno G1 
2-B-Adeno G1 
1-B-Adeno G1 
TGCTGTCGCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGAC 
JCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTTTTGCGGAT 
TGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGT 
GAAGAATTCCAGCTGAACAACGACTGCTCCTCCCCCGAGTTCATTG 
TGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGAT 
GGAGCAAAGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGA 
GATGAGACATTTTGCCAAGCAGTTGACCACTTAGTTCTCAAGAAGCA 
ACTATCTCTTTCATGTGCCTTCTGAGG 

FIG. 80 
MRGSHHHHHHGMASMWVLGIAATFCGLFLLPGFALQIQCYQCEEF 
QLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSADTKRTNTLLFEMR 
HFAKQLTT 

FIG. 83 
GATCTCGATCCCGCGAAATTAATACGACTCACTATAGGGAGACCACAACGGTTTCCCTCTAGAA 

ATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGCGGGGTTCTCATCATCATCATCATCATG 

GTATGGCTAGCATGTGGGTCCTAGGCATCGCGGCAACTTTTTGCGGATTGTTCTTGCTTCCAG 

GCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACGACTGCTCC 

TCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGAT 

GGAGCAAAGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCA 

AGCAGTTGACCACTTAGAAGCTTGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTG 

GCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG 

GGTTTTTTGCTGAAAGGAGGAACTATATCCGGATCTGGCGTAATAGCGAAGAGGCCCGCACCG 

ATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGGACGCGCCCTGTAGCGGCGCA 

TTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGC 

GCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGC 

TCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAA 

CTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGA 

CGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATC 

TCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAA 

ATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTAGGTGGCACTTTTCG 

GGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAA 

GAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTT 

CCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACG 

CTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGAT 

CTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTT 

TTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCG 

CCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGG 

ATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAA 

CTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGA 

TCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCG 

TGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTT 

ACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTC 

TGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGT 

CTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACAC 

GACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACT 

GATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTA^ 

TTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGT 

GAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTT 

TTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTT 

GCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACC 

AAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTA 

CATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTAC 

CGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTT 

CGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAG 

CTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAG 

GGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTC 

CTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGA 

GCGTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGGCTTTTGGTGGCCTTTTG 

CTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGA 

GCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGG 

AAGAGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG 

FIG. 84 
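
As a consistency check between the figures, the coding region embedded in the nucleotide sequence of FIG. 84 can be translated and compared with the protein of FIG. 83. The sketch below is illustrative: the variable and function names are hypothetical, the standard genetic code is assumed, and the tag segment is written with the OCR-damaged characters of the figure read as T, which is an assumption.

```python
# Standard genetic code, built in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding region as read from FIG. 84 (tag segment reconstructed; see lead-in).
TAG_DNA = "ATGCGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGC"
VARIANT_DNA = "".join((
    "ATGTGGGTCCTAGGCATCGCGGCAACTTTTTGCGGATTGTTCTTGCTTCCAG",
    "GCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACGACTGCTCC",
    "TCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGAT",
    "GGAGCAAAGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCA",
    "AGCAGTTGACCACTTAG",
))


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


tag = translate(TAG_DNA)
variant = translate(VARIANT_DNA)
print(tag)
print(variant)
print(len(tag + variant))
```

Translating the tag segment yields the His-tag leader (ending G-M-A-S, the M arising from an ATG codon), and translating the remainder yields the 84-residue variant sequence recited in the claims.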
Sequence Listing: 

<210> SEQ ID NO 1 

<211> Length : 1,250 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 1 
>H61775_T21 

GGAGGCGCTCGGGGCATCCGAGGCGGGGAGGCGGGTCCGCCCCCTATTGTGTAGCGGCGAGAGTGGAGCCGAGCGGT 
GCGGAGCAGATCTGGTGGTTCTCCGGAGAGCAGCTTCCTTGGGTGTTACATGAGCCAAGCCCTCACTGTACAGAAGA 
GTGAGAGCTGAAACCTGTTCCCTGAGCTGATCAGAAGGACATCCCTTGGCCCCTCCATCTGGGCTCCTGTGGATAGG 
AGGGGCTGGGTGAGCAGGCCAGCTGGGCTATGGTGTGGTGCCTCGGCCTGGCCGTCCTCAGCCTGGTCATCAGCCAG 
GGGGCTGACGGTCGAGGGAAGCCTGAGGTGGTATCGGTGGTGGGCCGGGCTGGGGAGAGTGTGGTGCTGGGCTGTGA 
CCTGCTGCCCCCGGCCGGCCGGCCCCCCCTGCATGTCATCGAGTGGCTGCGCTTTGGATTCCTGCTTCCCATCTTCA 
TCCAGTTCGGCCTCTACTCTCCCCGAATTGACCCTGATTACGTGGGTGACTGTGGGTTCCCTGCCTTCCGAGAGCTT 
AAGAGAGCAGAGACTGTGTCTCCTGTTTTCTTCACACGCCGCTGCATATGGGAAGATCTGAAGTCAACAGGCTTTAG 
CCCTGCAGGTGGAGGGAGGCCTCCAGGAGGTGGGCCCAGGACTCAGGAGGACTCAGGGCTGCCCTGCTGGCGATCTT 
CCTGTTCTGTAACACTACAGGTCTAGCAGTCCAGCTGTCACAGAAAAGCTAGGACATGCAGTATGCTTCTTTGGATA 
TTCTGAGTAACATTTGGACTGTTACCCATTGGCTACCAGCATCTCCCAAGTGAGAATACATAGATTACCCCCAGTGC 
CCTGAACAGCACTCGGTCCTAACACCCGTGTCCATGGAAAGCACGCCGCGTCTGGAGAAAGAAGCCGAAGGCTCTTG 
TCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATTCCTTCAGCTACAATGGAATTCTTGGGAGGATT 
AAATATGGTGACAACGCCTAATATTAGATGGCCTGTATTCCACACTCAATCTTCCTTCCCTCTTCTTCCTTCTTTGT 
AGAGCTATAATGAAAAGTATCATGTGGGACACAGAAGAGGTTGCAGTCTGGGGTCTGCAGGGCTTAGCGGCCAGGCA 
GATTAGCTTTCTTGAGGAATCCTGACAGTGGGTGGAAGGGTATGATGATG 

<210> SEQ ID NO 2 

<211> Length : 715 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 2 
>H61775_T22 

GGAGGCGCTCGGGGCATCCGAGGCGGGGAGGCGGGTCCGCCCCCTATTGTGTAGCGGCGAGAGTGGAGCCGAGCGGT 
GCGGAGCAGATCTGGTGGTTCTCCGGAGAGCAGCTTCCTTGGGTGTTACATGAGCCAAGCCCTCACTGTACAGAAGA 
GTGAGAGCTGAAACCTGTTCCCTGAGCTGATCAGAAGGACATCCCTTGGCCCCTCCATCTGGGCTCCTGTGGATAGG 
AGGGGCTGGGTGAGCAGGCCAGCTGGGCTATGGTGTGGTGCCTCGGCCTGGCCGTCCTCAGCCTGGTCATCAGCCAG 
GGGGCTGACGGTCGAGGGAAGCCTGAGGTGGTATCGGTGGTGGGCCGGGCTGGGGAGAGTGTGGTGCTGGGCTGTGA 
CCTGCTGCCCCCGGCCGGCCGGCCCCCCCTGCATGTCATCGAGTGGCTGCGCTTTGGATTCCTGCTTCCCATCTTCA 
TCCAGTTCGGCCTCTACTCTCCCCGAATTGACCCTGATTACGTGGGATAAGAGTTCTCCTCAGGTGGGAGGTAGGGA 
GGTATCAGCAAGAAAGGTGGGCTGGGTAGAGTCGCACAAGGCCTCCTATGAACGGCTTTGTCCCTGCTCTGATCTCA 
TCTCCAGCTCTGCTGCCTTAACTCTGCTTAATAAGCATGGCTGTGCTCCCAAGCAGTGTTAATTCATTGAAAGATGT 
CATTCATTTACACACACACACA 



<210> SEQ ID NO 3 

<211> Length : 2,875 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 3 
>M85491_PEA_1_T16 

TCTGCTGGCTGCGCGGTGGCGGCGGCTGTGTGTGCGCCGCGCCTTGCCGCCCCCCCTGGCCCCCCGAGCCCGGGGCG 
CGCGCTCCCGCCCGGGCCGTCCGGGCCCCGCGGCGCCGCGGCCCGAGGCCCCGGGAAGCGCAGCCATGGCTCTGCGG 
AGGCTGGGGGCCGCGCTGCTGCTGCTGCCGCTGCTCGCCGCCGTGGAAGAAACGCTAATGGACTCCACTACAGCGAC 
TGCTGAGCTGGGCTGGATGGTGCATCCTCCATCAGGGTGGGAAGAGGTGAGTGGCTACGATGAGAACATGAACACGA 
TCCGCACGTACCAGGTGTGCAACGTGTTTGAGTCAAGCCAGAACAACTGGCTACGGACCAAGTTTATCCGGCGCCGT 
GGCGCCCACCGCATCCACGTGGAGATGAAGTTTTCGGTGCGTGACTGCAGCAGCATCCCCAGCGTGCCTGGCTCCTG 
CAAGGAGACCTTCAACCTCTATTACTATGAGGCTGACTTTGACTCGGCCACCAAGACCTTCCCCAACTGGATGGAGA 
ATCCATGGGTGAAGGTGGATACCATTGCAGCCGACGAGAGCTTCTCCCAGGTGGACCTGGGTGGCCGCGTCATGAAA 
ATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCCGCAGCGGCTTCTACCTGGCCTTCCAGGACTATGGCGGCTG 
CATGTCCCTCATCGCCGTGCGTGTCTTCTACCGCAAGTGCCCCCGCATCATCCAGAATGGCGCCATCTTCCAGGAAA 
CCCTGTCGGGGGCTGAGAGCACATCGCTGGTGGCTGCCCGGGGCAGCTGCATCGCCAATGCGGAAGAGGTGGATGTA 
CCCATCAAGCTCTACTGTAACGGGGACGGCGAGTGGCTGGTGCCCATCGGGCGCTGCATGTGCAAAGCAGGCTTCGA 
GGCCGTTGAGAATGGCACCGTCTGCCGAGGTTGTCCATCTGGGACTTTCAAGGCCAACCAAGGGGATGAGGCCTGTA 
CCCACTGTCCCATCAACAGCCGGACCACTTCTGAAGGGGCCACCAACTGTGTCTGCCGCAATGGCTACTACAGAGCA 
GACCTGGACCCCCTGGACATGCCCTGCACAACCATCCCCTCCGCGCCCCAGGCTGTGATTTCCAGTGTCAATGAGAC 
CTCCCTCATGCTGGAGTGGACCCCTCCCCGCGACTCCGGAGGCCGAGAGGACCTCGTCTACAACATCATCTGCAAGA 
GCTGTGGCTCGGGCCGGGGTGCCTGCACCCGCTGCGGGGACAATGTACAGTACGCACCACGCCAGCTAGGCCTGACC 
GAGCCACGCATTTACATCAGTGACCTGCTGGCCCACACCCAGTACACCTTCGAGATCCAGGCTGTGAACGGCGTTAC 
TGACCAGAGCCCCTTCTCGCCTCAGTTCGCCTCTGTGAACATCACCACCAACCAGGCAGCTCCATCGGCAGTGTCCA 
TCATGCATCAGGTGAGCCGCACCGTGGACAGCATTACCCTGTCGTGGTCCCAGCCAGACCAGCCCAATGGCGTGATC 
CTGGACTATGAGCTGCAGTACTATGAGAAGGTACCTATTGGCTGGGTGCTGTCCCCATCACCCACCTCCCTGAGGGC 
CCCTCTCCCAGGCTGAGGCCTGGGAGTTCTGCCCCACCGCAAGATGAGACGCACTGGTGCAGCAGAAAGAGCACTGG 
CCTTGGAGTCAGGCTGCCTGGCTCCCAATCCAGCTCCGCTCCTTCCCACTGTGAGACCTCAGGCAGGTGCCTTGACC 
TCTCTGGATCTCACTTTTCTGGTCTGGAGGATACACCCAGCAATCTCAGTGAAATGCAACAGTCACATCCCTTTCCC 
TACCACGACCCTTTCATCTTGACCTCAGTGGCTTGATGTTGGGAAAAACTGGGTTTCCAAAAAGCTGCACTTATGAA 
GTGATAATTAGTCACTCACCTCTTCTTCGACAGAGATTTGAAACAGCTCAAGAGAGCTTCCGCCTGCCCTGCTCTGA 
GTCCTGCTAAAACACCCACTTTCACTCGCCTGCATGCCCTTTGCATGGGGAGAGGTGATTTCACTTTGAGCTTTTAA 
ATCAGACCTTAATTACTCCCTTTGGGTGGAAGCCCCTGGGATGGTAGAAGGATCACTGGACTAAGAGTGAGAAGCCG 
TAGGTTCAAATCCCAGCTCCGTCCTTCACCAGCTATGTGACCTTGGGCAGGCGTCTTTCTCCCTCTGAACCTCAGTT 
TCCACCTGTGTCGAGTGTGGGTGAGACCCCTCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCA 
GAATGGGACCTGGCCCTCAGCAGAGGCCATGTGTGTCCCTGGCCTTCCTCCTCTGCCCTGCCTGCTGCACAGTGGGC 
AATGGTGACAGGATGGGAGGCCAAGTGGATGTGGGGTCTGCACAGTACAGGGGCCAGGAGGTAGACAGCACAATTGC 
CCACCCACATGGCTGGACATCAGAGGCCCCAGGAAGCCTCTCCTTTGAATGATCACTTCTCTTACCTGCTCCAGGAG 
GCAACAAACAGCCACAGAGGCTGCAAGGGCACCTGGGAAAGGCATCGCGGGGCTTCCATTCAGACTAGGTGTCAATG 
ACTGACAGGGAGGCCTTTGGTTGAGGGCAAGCCCACGGGGAACTGCAGATGGATGGAAGGGCTCTCCCTGAAGGCTG 
AGAGGAAGAGTGCAGTCAATTGCAGCCAGTCCTGCTGGAGCCCAACTTTCTAGAGCCCAGCCCGGCCTTCCCACTCT 
GTTAACTGCTGGATCGGCTAACCAGGCCGGTCTCCAGGGCCTTTCAAACACTTACCCAGCCTTTGCCGGCCGTCTTA 
CCATTGCTTGCGTGCGTGTTCATCCC 

<210> SEQ ID NO 4 

<211> Length : 1,182 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 4 
>M85491_PEA_1_T20 

TCTGCTGGCTGCGCGGTGGCGGCGGCTGTGTGTGCGCCGCGCCTTGCCGCCCCCCCTGGCCCCCCGAGCCCGGGGCG 
CGCGCTCCCGCCCGGGCCGTCCGGGCCCCGCGGCGCCGCGGCCCGAGGCCCCGGGAAGCGCAGCCATGGCTCTGCGG 
AGGCTGGGGGCCGCGCTGCTGCTGCTGCCGCTGCTCGCCGCCGTGGAAGAAACGCTAATGGACTCCACTACAGCGAC 
TGCTGAGCTGGGCTGGATGGTGCATCCTCCATCAGGGTGGGAAGAGGTGAGTGGCTACGATGAGAACATGAACACGA 
TCCGCACGTACCAGGTGTGCAACGTGTTTGAGTCAAGCCAGAACAACTGGCTACGGACCAAGTTTATCCGGCGCCGT 
GGCGCCCACCGCATCCACGTGGAGATGAAGTTTTCGGTGCGTGACTGCAGCAGCATCCCCAGCGTGCCTGGCTCCTG 
CAAGGAGACCTTCAACCTCTATTACTATGAGGCTGACTTTGACTCGGCCACCAAGACCTTCCCCAACTGGATGGAGA 
ATCCATGGGTGAAGGTGGATACCATTGCAGCCGACGAGAGCTTCTCCCAGGTGGACCTGGGTGGCCGCGTCATGAAA 
ATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCCGCAGCGGCTTCTACCTGGCCTTCCAGGACTATGGCGGCTG 
CATGTCCCTCATCGCCGTGCGTGTCTTCTACCGCAAGTGCCCCCGCATCATCCAGAATGGCGCCATCTTCCAGGAAA 
CCCTGTCGGGGGCTGAGAGCACATCGCTGGTGGCTGCCCGGGGCAGCTGCATCGCCAATGCGGAAGAGGTGGATGTA 
CCCATCAAGCTCTACTGTAACGGGGACGGCGAGTGGCTGGTGCCCATCGGGCGCTGCATGTGCAAAGCAGGCTTCGA 
GGCCGTTGAGAATGGCACCGTCTGCCGAGAGAGACAGGATCTCACTATGTTGTCCAGGCTGGTCTTGAACTCGTGGC 
CACAAATGATCCTCCCACCTCAGCCTCCCAAAGTGTTGGAATTATAGGCATGAACCACCATGCCCAGGAGGAGAATT 
TTTGATAATAATATTTTGTGGACATCTTTGCATATCATGTCAGAGCTATAACATCATTGTGGAGAAGCTCTTAGGAT 
CCCATAGAATAAATGTACCGTAATTTA 



<210> SEQ ID NO 5 

<211> Length : 2,199 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 5 
>T39971_T10 

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 
TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA 
AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 
GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 
TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT 
ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 
CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 
GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 
AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 
AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 
CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGGCTGACCAAGAGTCATGCAAGGGCCGCTGCACTGAGGG 
CTTCAACGTGGACAAGAAGTGCCAGTGTGACGAGCTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTG 
AGTGCAAGCCCCAAGTGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAG 
GAGAAAAACAATGCCACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGG 
GAATCCTGAGCAGACACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGG 
GGATAGACTCAAGGCCTGAGACCCTTCATCCAGGGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 
CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCTCTTTGCCTTCCGAGGGCAGTACTGCTATGAACTGGACGA 
AAAGGCAGTGAGGCCTGGGTACCCCAAGCTCATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCA 
CCCGCATCAACTGTCAGGGGAAGACCTACCTCTTCAAGGGTAGTCAGTACTGGCGCTTTGAGGATGGTGTCCTGGAC 
CCTGATTACCCCCGAAATATCTCTGACGGCTTCGATGGCATCCCGGACAACGTGGATGCAGCCTTGGCCCTCCCTGC 
CCATAGCTACAGTGGCCGGGAGCGGGTCTACTTCTTCAAGGGGAAACAGTACTGGGAGTACCAGTTCCAGCACCAGC 
CCAGTCAGGAGGAGTGTGAAGGCAGCTCCCTGTCGGCTGTGTTTGAACACTTTGCCATGATGCAGCGGGACAGCTGG 
GAGGACATCTTCGAGCTTCTCTTCTGGGGCAGAACCTCTGGCATGGCACCCCGCCCCTCCTTGGCCAAGAAACAAAG 
GTTTAGGCATCGCAACCGCAAAGGCTACCGTTCACAACGAGGCCACAGCCGTGGCCGCAACCAGAACTCCCGCCGGC 
CATCCCGCGCCACGTGGCTGTCCTTGTTCTCCAGTGAGGAGAGCAACTTGGGAGCCAACAACTATGATGACTACAGG 
ATGGACTGGCTTGTGCCTGCCACCTGTGAACCCATCCAGAGTGTCTTCTTCTTCTCTGGAGACAAGTACTACCGAGT 
CAATCTTCGCACACGGCGAGTGGACACTGTGGACCCTCCCTACCCACGCTCCATCGCTCAGTACTGGCTGGGCTGCC 
CAGCTCCTGGCCATCTGTAGGAGTCAGAGCCCACATGGCCGGGCCCTCTGTAGCTCCCTCCTCCCATCTCCTTCCCC 
CAGCCCAATAAAGGTCCCTTAGCCCCGAAAAAAAAGCKATAAT 

<210> SEQ ID NO 6 

<211> Length : 1,947 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 6 
>T39971_T12 

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 
TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA 
AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 
GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 
TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT 
ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 
CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 
GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 
AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 
AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 
CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGGCTGACCAAGAGTCATGCAAGGGCCGCTGCACTGAGGG 
CTTCAACGTGGACAAGAAGTGCCAGTGTGACGAGCTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTG 
AGTGCAAGCCCCAAGTGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAG 
GAGAAAAACAATGCCACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGG 
GAATCCTGAGCAGACACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGG 
GGATAGACTCAAGGCCTGAGACCCTTCATCCAGGGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 
CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCTCTTTGCCTTCCGAGGGCAGTACTGCTATGAACTGGACGA 
AAAGGCAGTGAGGCCTGGGTACCCCAAGCTCATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCA 
CCCGCATCAACTGTCAGGGGAAGACCTACCTCTTCAAGGGTAGTCAGTACTGGCGCTTTGAGGATGGTGTCCTGGAC 
CCTGATTACCCCCGAAATATCTCTGACGGCTTCGATGGCATCCCGGACAACGTGGATGCAGCCTTGGCCCTCCCTGC 
CCATAGCTACAGTGGCCGGGAGCGGGTCTACTTCTTCAAGGGGAAACAGTACTGGGAGTACCAGTTCCAGCACCAGC 
CCAGTCAGGAGGAGTGTGAAGGCAGCTCCCTGTCGGCTGTGTTTGAACACTTTGCCATGATGCAGCGGGACAGCTGG 
GAGGACATCTTCGAGCTTCTCTTCTGGGGCAGAACCTCTGACAAGTACTACCGAGTCAATCTTCGCACACGGCGAGT 
GGACACTGTGGACCCTCCCTACCCACGCTCCATCGCTCAGTACTGGCTGGGCTGCCCAGCTCCTGGCCATCTGTAGG 
AGTCAGAGCCCACATGGCCGGGCCCTCTGTAGCTCCCTCCTCCCATCTCCTTCCCCCAGCCCAATAAAGGTCCCTTA 
GCCCCGAAAAAAAAGCKATAAT 

<210> SEQ ID NO 7 

<211> Length : 1,592 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 7 
>T39971_T16 

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 
TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA 
AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 
GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 
TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT 
ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 
CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 
GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 
AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 
AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 
CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGGCTGACCAAGAGTCATGCAAGGGCCGCTGCACTGAGGG 
CTTCAACGTGGACAAGAAGTGCCAGTGTGACGAGCTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTG 
AGTGCAAGCCCCAAGTGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAG 
GAGAAAAACAATGCCACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGG 
GAATCCTGAGCAGACACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGG
GGATAGACTCAAGGCCTGAGACCCTTCATCCAGGGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 
CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCTCTTTGCCTTCCGAGGGCAGTACTGCTATGAACTGGACGA 
AAAGGCAGTGAGGCCTGGGTACCCCAAGCTCATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCA 
CCCGCATCAACTGTCAGGGGAAGACCTACCTCTTCAAGGTGCCAGGGGCTGTGGGCCAGGGTAGAAAGCATCTAGGG 
AGGGTTTGAGAGCTATTGCTCCCAGGGACAGGGTGGACAGGGAAGCTGGACCCAGGGCCCTGCAGGACCTGGTGGGA 
GCTCTGTGAGCACAGGGCAGCCCCAAGACTCCAGGTCCTGGGCAGTGAACCT 

<210> SEQ ID NO 8 

<211> Length : 2,490

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 8 
>T39971_T5 

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 

TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA

AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 

GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 

TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT 

ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 

CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 

GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 

AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 

AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 

CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGGCTGACCAAGAGTCATGCAAGGGCCGCTGCACTGAGGG 

CTTCAACGTGGACAAGAAGTGCCAGTGTGACGAGCTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTG 

AGTGCAAGCCCCAAGTGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAG 

GAGAAAAACAATGCCACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGG 

GAATCCTGAGCAGACACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGG 

GGATAGACTCAAGGCCTGAGACCCTTCATCCAGGGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 

CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCTCTTTGCCTTCCGAGGGCAGTACTGCTATGAACTGGACGA 

AAAGGCAGTGAGGCCTGGGTACCCCAAGCTCATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCA 

CCCGCATCAACTGTCAGGGGAAGACCTACCTCTTCAAGGGTAGTCAGTACTGGCGCTTTGAGGATGGTGTCCTGGAC 

CCTGATTACCCCCGAAATATCTCTGACGGCTTCGATGGCATCCCGGACAACGTGGATGCAGCCTTGGCCCTCCCTGC 

CCATAGCTACAGTGGCCGGGAGCGGGTCTACTTCTTCAAGGGTACTCAGGGGGTGGTGGGAGACTGAGCAGGCAGTG 
GAGCAGTCTTGGATTCCTTTCACATTTCACTGGGGACAGGCCTCAGCATGTGCCCACCCCTGACCCCCACCTCATGC
TGGGAGATCCTAACTTCAACAGCCTCTGGGATCTCCAGTCTTGCCCTGGCCCAGCCCTCCTAATGCCCACCACCCCG 
CTCCTCAGGGAAACAGTACTGGGAGTACCAGTTCCAGCACCAGCCCAGTCAGGAGGAGTGTGAAGGCAGCTCCCTGT 
CGGCTGTGTTTGAACACTTTGCCATGATGCAGCGGGACAGCTGGGAGGACATCTTCGAGCTTCTCTTCTGGGGCAGA 
ACCTCTGCTGGTACCAGACAGCCCCAGTTCATTAGCCGGGACTGGCACGGTGTGCCAGGGCAAGTGGACGCAGCCAT 
GGCTGGCCGCATCTACATCTCAGGCATGGCACCCCGCCCCTCCTTGGCCAAGAAACAAAGGTTTAGGCATCGCAACC 
GCAAAGGCTACCGTTCACAACGAGGCCACAGCCGTGGCCGCAACCAGAACTCCCGCCGGCCATCCCGCGCCACGTGG 
CTGTCCTTGTTCTCCAGTGAGGAGAGCAACTTGGGAGCCAACAACTATGATGACTACAGGATGGACTGGCTTGTGCC 
TGCCACCTGTGAACCCATCCAGAGTGTCTTCTTCTTCTCTGGAGACAAGTACTACCGAGTCAATCTTCGCACACGGC 
GAGTGGACACTGTGGACCCTCCCTACCCACGCTCCATCGCTCAGTACTGGCTGGGCTGCCCAGCTCCTGGCCATCTG 
TAGGAGTCAGAGCCCACATGGCCGGGCCCTCTGTAGCTCCCTCCTCCCATCTCCTTCCCCCAGCCCAATAAAGGTCC 
CTTAGCCCCGAAAAAAAAGCKATAAT

<210> SEQ ID NO 9 

<211> Length : 4,755

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 9 
>Z21368_PEA_1_T10

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGGATTCTT 
CACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTA 
CAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGAATGGAGAG 
GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGG 
ATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATG 
AAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATC 
CCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATC 
AAGATGTGGAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATC 
AATGCCTTTGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAA 
TGTCTACACCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATC 
TTAACAACACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGG 
TGGCGAGAATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAA
GCATGGATTTGATTATGCAAAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCC 
CCACAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAA 
ACACTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGC 
TCCAGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAAT 
ACTTACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATA 
TGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTC 
TCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTC 
CTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATT 
CCTAGTGGAAAGAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCA 
AATATGAACGGGTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAA 
TGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAG 
CACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCA 
GCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCAT 
ACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATT 
GCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCA 
GTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACAC 
AAGTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCA 
TAAGGCATACATTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGA 
AGAGAAGGAAGCCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAG 
AAATTAAAGAGCCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAA 
CAACCGTAGGAGGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCA 
CTTGCTTCACGCATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCTGGGATCTTTCTGTGCTTGCACGAGT 
TCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAATTTTCTTTTCTGTGAGTTTGCTACTGG 
CTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAGCTCACAAATACAGTGCACACGGTAGAACGAGGCATTT 
TGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAAT 
CTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATC 
AGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTAC 
AGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAA 
GAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGC 
ACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTG 
ACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAA 
CGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTA 
ATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATT 
CGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGT 
GCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGG 
GGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCA 
CTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAA 
TAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGC
AAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGG 
GTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGAC 
ATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGT 
AAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCG 
TCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTT 
ACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGA 
TCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGC 
AGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAAAATGA 

<210> SEQ ID NO 10 

<211> Length : 4,677

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 10 
>Z21368_PEA_1_T11

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGGATTCTT 
CACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTA 
CAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGAATGGAGAG 
GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGG 
ATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATG 
AAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATC 
CCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATC 
AAGATGTGGAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATC 
AATGCCTTTGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAA 
TGTCTACACCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATC 
TTAACAACACTGGCTACAGAACAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTC 
TAAGAGAATGTATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCC 
CACAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAA 
CACTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCT 
CCAGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATA 
CTTACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATAT
GACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCT 
CAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCC 
TCAAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTC 
CTAGTGGAAAGAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAA 
ATATGAACGGGTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAAT 
GCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGC 
ACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAG 
CAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATA 
CTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATTG 
CAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAG 
TGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACA 
AGTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCAT 
AAGGCATACATTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAA 
GAGAAGGAAGCCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGA 
AATTAAAGAGCCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAAC 
AACCGTAGGAGGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCAC 
TTGCTTCACGCATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCTGGGATCTTTCTGTGCTTGCACGAGTT 
CTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAATTTTCTTTTCTGTGAGTTTGCTACTGGC
TTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAGCTCACAAATACAGTGCACACGGTAGAACGAGGCATTTT
GAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATC 
TTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATCA 
GCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTACA 
GACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAAG 
AGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCA 
CGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTGA 
CCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAAC 
GACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTAA 
TCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTC 
GCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTG 
CATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGG 
GAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCAC 
TCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAAT 
AAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCA 
AGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGG 
TTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGACA 
TAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGTA 
AATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGT
CATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTA 
CAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGAT 
CACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCA 
GAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAAAATGA 

<210> SEQ ID NO 11 

<211> Length : 2,790 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 11 
>Z21368_PEA_1_T23

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGGATTCTT 
CACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTA 
CAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGAATGGAGAG 
GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGG 
ATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATG 
AAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATC 
CCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATC 
AAGATGTGGAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATC 
AATGCCTTTGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAA 
TGTCTACACCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATC 
TTAACAACACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGG 
TGGCGAGAATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAA 
GCATGGATTTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTA
AGAGAATGTATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCA 
CAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAACA 
CTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCC 
AGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACT 
TACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGA 
CTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCA 
ACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTC 
AAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCT
AGTGGAAAGAGGGTAATTATTGGTTCCTGGGGTGCTTCTGGGAACCAGTCCTAGTGGGCAGCTTTCCCTGCTGAGTA 
TTTTTTTTCTCCTTATTTTTGTTTACTAAGCATGCAGATTTCGTAAACCTAGTCACAAGATTGAATGGTTTGCTGCT 
TATTCTGTAGTGGTCAATAGAGTAATAATTGCTGGATCAGAATTGTAAAGAATAACCCTCAAGTTGGTTAATTGGTA 
CAAAAACACAGTTAGATAGAAGTTATAGAATTTGATAGTATAGTTGGGACATTATCGTTAACAATAATTTATGTATA 
TCTTAAAATAGCTAGAAGTGAAGAATTGCAAAGTTCCCAACACAAGGAAAAGATAAATGAGATGATGAATATCCCAA 
TTATCTTGATTTGATCATTACACATTGTAGACTGGTATCCATATATCACACGTACCCCCAAAATATGTATAATTGTG 
ATATATCAATTTTTAAAATACCAAAAAAGCAAGAGAATGACGACTCCACATCCCCCAAAAAGAATAAATTCTCATAA 
GCTTGGACCAAAGCCTTTATCATGGGTGTAGATTTACTGTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGG 
ATTGAGTTTCTCTACCATCCTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCTTCTTCCTACT 
TCCTACTTCTTTTCCTTGTGTGCCTTTCCTAGTTTTAAATAGATAAATGTATGCCATTGTAATTATTTCCATTGTCA 
CTTCTGGGTTTCCCCTTTTGGTTCATTAATACCCATTGCCTTGTTTTTCTCTGTACATAAATTAGGAGAGAGAAAAT 
ATTTGTATAATTTTTTTA 

<210> SEQ ID NO 12 

<211> Length : 3,069 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 12 
>Z21368_PEA_1_T24

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGGATTCTT 
CACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTA 
CAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGAATGGAGAG 
GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGG 
ATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATG 
AAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATC 
CCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATC 
AAGATGTGGAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATC 
AATGCCTTTGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAA 
TGTCTACACCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATC 
TTAACAACACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGG 
TGGCGAGAATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAA

GCATGGATTTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTA 

AGAGAATGTATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCA 

CAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAACA 

CTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCC 

AGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACT 

TACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGA 

CTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCA 

ACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTC 

AAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTGTGTCATTGTTCCTCCTCTCAGCCAGCCCCAAATACACTGAGC 

TCCAGCTGGTGCCCAGAGCCAGCCAGCAGCTGAAGACATGGAGGCAGAATATGCCTTGCCCACAAGGATCACCCCAA 

GCTGAGCATTTCTCAGCTGCTTGTGAATAGCATATTGATGGAGATGCACTCATGGTCTGTGGGAAGTGAGAGGTGTT 

TCTTTAAATAAGCTGTTAGCACAGATCCATTTGGAAAAACGTCCAGATGCCAAAAGTAAATATTATCATTTTGCTTT 

CAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGGGTAATTATTGGTTCCTG 

GGGTGCTTCTGGGAACCAGTCCTAGTGGGCAGCTTTCCCTGCTGAGTATTTTTTTTCTCCTTATTTTTGTTTACTAA 

GCATGCAGATTTCGTAAACCTAGTCACAAGATTGAATGGTTTGCTGCTTATTCTGTAGTGGTCAATAGAGTAATAAT 

TGCTGGATCAGAATTGTAAAGAATAACCCTCAAGTTGGTTAATTGGTACAAAAACACAGTTAGATAGAAGTTATAGA 

ATTTGATAGTATAGTTGGGACATTATCGTTAACAATAATTTATGTATATCTTAAAATAGCTAGAAGTGAAGAATTGC 

AAAGTTCCCAACACAAGGAAAAGATAAATGAGATGATGAATATCCCAATTATCTTGATTTGATCATTACACATTGTA 

GACTGGTATCCATATATCACACGTACCCCCAAAATATGTATAATTGTGATATATCAATTTTTAAAATACCAAAAAAG 

CAAGAGAATGACGACTCCACATCCCCCAAAAAGAATAAATTCTCATAAGCTTGGACCAAAGCCTTTATCATGGGTGT 

AGATTTACTGTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATCCTCCCCACGT 

TCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCTTCTTCCTACTTCCTACTTCTTTTCCTTGTGTGCCTTTCC 

TAGTTTTAAATAGATAAATGTATGCCATTGTAATTATTTCCATTGTCACTTCTGGGTTTCCCCTTTTGGTTCATTAA 

TACCCATTGCCTTGTTTTTCTCTGTACATAAATTAGGAGAGAGAAAATATTTGTATAATTTTTTTA 

<210> SEQ ID NO 13 

<211> Length : 5,384 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 13 
>Z21368_PEA_1_T5 

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGTGCTGAC
GGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAATTGCAGAAA 
ATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGGATACCTAA 
TTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATGAAGTATTC 
TTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGT 
TCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATCAAGATGTG 
GAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATCAATGCCTT 
TGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAATGTCTACA 
CCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATCTTAACAAC 
ACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGGTGGCGAGA
ATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAAGCATGGAT 
TTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTAAGAGAATG 
TATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTC 
TAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTA 
TGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTG 
ATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCAT 
TTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATA 
TTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCAACATTGAC 
TTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCT 
GGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAA 
GAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGG 
GTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAATGCATTGAGGA 
TACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGCACGCGGAACC 
TCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAA 
AGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATACTCGGCAGAC 
ACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGC 
AACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAAC 
AGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACAAGTGTTTTAT 
TCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATAAGGCATACA 
TTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAG 
CCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAG 
CCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGA 
GGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACG 
CATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCCTCACAAATACAGTGCACACGGTAGAACGAGGCATTT 
TGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAAT 
CTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATC 
AGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTAC 
AGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAA 
GAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGC
ACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTG 
ACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAA 
CGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTA 
ATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATT 
CGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGT 
GCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGG 
GGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCA 
CTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAA 
TAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGC 
AAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGG 
GTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGAC 
ATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGT 
AAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCG 
TCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTT 
ACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGA 
TCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGC 
AGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAAAATGAAGATTGCCTGCTCTCTCTG 
TGCCTAGCCTCAAAGCGTTCATCATACATCATACCTTTAAGATTGCTATATTTTGGGTTATTTTCTTGACAGGAGAA 
AAAGATCTAAAGATCTTTTATTTTCATCTTTTTTGGTTTTCTTGGCATGACTAAGAAGCTTAAATGTTGATAAAATA 
TGACTAGTTTTGAATTTACACCAAGAACTTCTCAATAAAAGAAAATCATGAATGCTCCACAATTTCAACATACCACA 
AGAGAAGTTAATTTCTTAACATTGTGTTCTATGATTATTTGTAAGACCTTCACCAAGTTCTGATATCTTTTAAAGAC 
ATAGTTCAAAATTGCTTTTGAAAATCTGTATTCTTGAAAATATCCTTGTTGTGTATTAGGTTTTTAAATACCAGCTA 
AAGGATTACCTCACTGAGTCATCAGTACCCTCCTATTCAGCTCCCCAAGATGATGTGTTTTTGCTTACCCTAAGAGA 
GGTTTTCTTCTTATTTTTAGATAATTCAAGTGCTTAGATAAATTATGTTTTCTTTAAGTGTTTATGGTAAACTCTTT 
TAAAGAAAATTTAATATGTTATAGCTGAATCTTTTTGGTAACTTTAAATCTTTATCATAGACTCTGTACATATGTTC 
AAATTAGCTGCTTGCCTGATGTGTGTATCATCGGTGGGATGACAGAACAAACATATTTATGATCATGAATAATGTGC 
TTTGTAAAAAGATTTCAAGTTATTAGGAAGCATACTCTGTTTTTTAATCATGTATAATATTCCATGATACTTTTATA 
GAACAATTCTGGCTTCAGGAAAGTCTAGAAGCAATATTTCTTCAAATAAAAGGTGTTTAAACTTTTTTCTG 

<210> SEQ ID NO 14 

<211> Length : 4,524

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 



<400> sequence : 14
>Z21368_PEA_1_T6

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGTGCTGAC 
GGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAATTGCAGAAA 
ATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGGATACCTAA 
TTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATGAAGTATTC 
TTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGT 
TCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATCAAGATGTG 
GAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATCAATGCCTT 
TGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAATGTCTACA 
CCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATCTTAACAAC 
ACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGGTGGCGAGA 
ATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAAGCATGGAT 
TTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTAAGAGAATG 
TATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTC 
TAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTA 
TGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTG 
ATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCAT 
TTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATA 
TTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCAACATTGAC 
TTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCT 
GGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAA 
GAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGG 
GTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAATGCATTGAGGA 
TACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGCACGCGGAACC 
TCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAA 
AGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATACTCGGCAGAC 
ACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGC 
AACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAAC 
AGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACAAGTGTTTTAT 
TCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATAAGGCATACA 
TTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAG 
CCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAG 
CCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGA 
GGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACG 
CATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCCTCACAAATACAGTGCACACGGTAGAACGAGGCATTT 
TGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAAT
CTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATC 
AGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTAC 
AGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAA 
GAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGC 
ACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTG 
ACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAA 
CGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTA
ATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATT 
CGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGT 
GCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGG 
GGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCA 
CTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAA 
TAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGC 
AAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGG 
GTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGAC 
ATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGT 
AAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCG 
TCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTT 
ACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGA 
TCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGC 
AGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAAAATGA 

<210> SEQ ID NO 15 

<211> Length : 4,454 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 15 
>Z21368_PEA_1_T9

AGGTTACTTGACTGGGAGTTCTCAGACCTCCAGTTTCAGCCCTGCCCTCAGCCTCCAATCCGTAAGAGACACCCAGC 
CCCAGCAATTGGATTGGGCAGCCCGTCTTGACACACCACTGTGCTGAGTGCTTGAGGACGTGTTTCAACAGATGGTT 
GGGGTTAGTGTGTGTCATCACATTCGAGTGGGGATTAAGAGAAGGAAGGCTGCCTTGCTGGAGCTGTGTGGTCTTCT 
CCAAGTGAGAGTCGCAGGCAATAGAACTACTTTGCTTTTGGAGGAAAAGGAGGAATTCATTTTCAGCAGACACAAGA 
AAAGCAGTTTTTTTTTCAGGTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGG 
TTCTTCAAGGCAGAATAATTGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGA 
GATTGATTATTCAACCAGGATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAA
CATTGGACCAAATACAATGAAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCC 
TCTGTTCGACTGTCAGATCCCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATT 
CTTGTGCTTACCGATGATCAAGATGTGGAGCTGGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACAT 
CCCCCCTGGGTGGCGAGAATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCA 
TCAAAGAAAAGCATGGATTTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTC 
AAAATGTCTAAGAGAATGTATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGA 
CTCAGCCCCACAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATA 
TGGATAAACACTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGC 
AAAAGGCTCCAGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCT 
GGAGAATACTTACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCA 
TGCCATATGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAG 
ATCGTTCTCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAA 
GTCTGTCCTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTG 
ATACATTCCTAGTGGAAAGAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCAC 
TTGCCCAAATATGAACGGGTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAA 
GTGGCAATGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCC 
GGCAGAGCACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTAC 
CGTGCCAGCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATT 
TGTCCATACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAG 
AAGAATTGCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAG 
GCTTCCAGTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGT 
GACACACAAGTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGA 
AGGACCATAAGGCATACATTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGA 
CATCTGAAGAGAAGGAAGCCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAA 
GCAAGAGAAATTAAAGAGCCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCA 
AGGAGAACAACCGTAGGAGGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCT 
GGCCTCACTTGCTTCACGCATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCTGGGATCTTTCTGTGCTTG 
CACGAGTTCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAATTTTCTTTTCTGTGAGTTTG 
CTACTGGCTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAGCTCACAAATACAGTGCACACGGTAGAACGA 
GGCATTTTGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACC 
TAAGAATCTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAG 
GTTAATCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTA 
TGAGTACAGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCAC 
TGCTGAAGAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCT 
GCTGAGCACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACC 
TCATTTGACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAAT 
GGAATAACGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAA 
GAGACTAATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCA
GCCCATTCGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATC 
TTCCTGTGCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGC 
ATAGCGGGGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTC 
CTCTTCACTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGT 
AAACTAATAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCA 
TGTGAGCAAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGA 
TATTTGGGTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCG 
TAGGGACATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACAC 
CCCTGGTAAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTC 
AATGCCGTCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTT 
TATTTTTACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACA 
GTATGGATCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATC 
GATTAGCAGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAAAATGA



SEQ ID NO: 16 
>H53626_PEA_1_T15

GTCCGGACAGGCCGAGATGACGCCGAGCCCCCTGTTGCTGCTCCTGCTGCCGCCGCTGCTGCTGGGGGCCTTCCCGC 
CGGCCGCCGCCGCCCGAGGCCCCCCAAAGATGGCGGACAAGGTGGTCCCACGGCAGGTGGCCCGGCTGGGCCGCACT 
GTGCGGCTGCAGTGCCCAGTGGAGGGGGACCCGCCGCCGCTGACCATGTGGACCAAGGATGGCCGCACCATCCACAG 
CGGCTGGAGCCGCTTCCGCGTGCTGCCGCAGGGGCTGAAGGTGAAGCAGGTGGAGCGGGAGGATGCCGGCGTGTACG 
TGTGCAAGGCCACCAACGGCTTCGGCAGCCTGAGCGTCAACTACACCCTCGTCGTGCTGGATGACATTAGCCCAGGG 
AAGGAGAGCCTGGGGCCCGACAGCTCCTCTGGGGGTCAAGAGGACCCCGCCAGCCAGCAGTGGGCACGACCGCGCTT 
CACACAGCCCTCCAAGATGAGGCGCCGGGTGATCGCACGGCCCGTGGGTAGCTCCGTGCGGCTCAAGTGCGTGGCCA 
GCGGGCACCCTCGGCCCGACATCACGTGGATGAAGGACGACCAGGCCTTGACGCGCCCAGAGGCCGCTGAGCCCAGG 
AAGAAGAAGTGGACACTGAGCCTGAAGAACCTGCGGCCGGAGGACAGCGGCAAATACACCTGCCGCGTGTCGAACCG 
CGCGGGCGCCATCAACGCCACCTACAAGGTGGATGTGATCCAGCGGACCCGTTCCAAGCCCGTGCTCACAGGCACGC 
ACCCCGTGAACACGACGGTGGACTTCGGGGGGACCACGTCCTTCCAGTGCAAGGTGCGCAGCGACGTGAAGCCGGTG 
ATCCAGTGGCTGAAGCGCGTGGAGTACGGCGCCGAGGGCCGCCACAACTCCACCATCGATGTGGGCGGCCAGAAGTT 
TGTGGTGCTGCCCACGGGTGACGTGTGGTCGCGGCCCGACGGCTCCTACCTCAATAAGCTGCTCATCACCCGTGCCC 
GCCAGGACGATGCGGGCATGTACATCTGCCTTGGCGCCAACACCATGGGCTACAGCTTCCGCAGCGCCTTCCTCACC 
GTGCTGCCAGGTGCGCGGCTGCCACGCCACGCCACACCATGCTGGTGCCCGGACCCGCCCCCTGGGCCCGGCGTCCC 
ACCCACCGGGTGGGGCCCCACCCTTCCCTCCCGGGCCGTGCTGGCCAGGTCATCTGCCGAGGGAGGGCAGCCCAGGG 
GCACCGTCTCCACAGCCCCTGGGATGGGTCTGGGGTGCTCTCCTGGTCTTTGTGTCGGCGTTCCCCTCCCTACCTCC 
TTTCCTCTCGCTCTTGCAGACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTCGGCCACTAGCCTGCCGTG 
GCCCGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCATCCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCAGA
AGAAGCCGTGCACCCCCGCGCCTGCCCCTCCCCTGCCTGGGCACCGCCCGCCGGGGACGGCCCGCGACCGCAGCGGA 
GACAAGGACCTTCCCTCGTTGGCCGCCCTCAGCGCTGGCCCTGGTGTGGGGCTGTGTGAGGAGCATGGGTCTCCGGC 
AGCCCCCCAGCACTTACTGGGCCCAGGCCCAGTTGCTGGCCCTAAGTTGTACCCCAAACTCTACACAGACATCCACA 
CACACACACACACACACTCTCACACACACTCACACGTGGAGGGCAAGGTCCACCAGCACATCCACTATCAGTGCTAG 
ACGGCACCGTATCTGCAGTGGGCACGGGGGGGCCGGCCAGACAGGCAGACTGGGAGGATGGAGGACGGAGCTGCAGA 
CGAAGGCAGGGGACCCATGGCGAGGAGGAATGGCCAGCACCCCAGGCAGTCTGTGTGTGAGGCATAGCCCCTGGACA 
CACACACACAGACACACACACTACCTGGATGCATGTATGCACACACATGCGCGCACACGTGCTCCCTGAAGGCACAC 
GTACGCACACACGCACATGCACAGATATGCCGCCTGGGCACACAGATAAGCTGCCCAAATGCACGCACACGCACAGA 
GACATGCCAGAACATACAAGGACATGCTGCCTGAACATACACACGCACACCCATGCGCAGATGTGCTGCCTGGACAC 
ACACACACACACGGATATGCTGTCTGGACGCACACACGTGCAGATATGGTATCCGGACACACACGTGCACAGATATG 
CTGCCTGGACACACAGATAATGCTGCCTTGACACACACATGCACGGATATTGCCTGGACACACACACACACACGCGT 
GCACAGATATGCTGTCTGGACACGCACACACATGCAGATATGCTGCCTGGACACACACTTCCAGACACACGTGCACA 
GGCGCAGATATGCTGCCTGGACACACGCAGATATGCTGTCTAGTCACACACACACGCAGACATGCTGTCCGGACACA 
CACACGCATGCACAGATATGCTGTCCGGACACACACACGCACGCAGATATGCTGCCTGGACACACACACAGATAATG 
CTGCCTCAACACTCACACACGTGCAGATATTGCCTGGACACACACATGTGCACAGATATGCTGTCTGGACATGCACA 
CACGTGCAGATATGCTGTCCGGATACACACGCACGCACACATGCAGATATGCTGCCTGGGCACACACTTCCGGACAC 
ACATGCACACACAGGTGCAGATATGCTGCCTGGACACACGCAGACTGACGTGCTTTTGGGAGGGTGTGCCGTGAAGC 
CTGCAGTACGTGTGCCGTGAGGCTCATAGTTGATGAGGGACTTTCCCTGCTCCACCGTCACTCCCCCAACTCTGCCC 
GCCTCTGTCCCCGCCTCAGTCCCCGCCTCCATCCCCGCCTCTGTCCCCTGGCCTTGGCGGCTATTTTTGCCACCTGC 
CTTGGGTGCCCAGGAGTCCCCTACTGCTGTGGGCTGGGGTTGGGGGCACAGCAGCCCCAAGCCTGAGAGGCTGGAGC 
CCATGGCTAGTGGCTCATCCCCACTGCATTCTCCCCCTGACACAGAGAAGGGGCCTTGGTATTTATATTTAAGAAAT 
GAAGATAATATTAATAATGATGGAAGGAAGACTGGGTTGCAGGGACTGTGGTCTCTCCTGGGGCCCGGGACCCGCCT 
GGTCTTTCAGCCATGCTGATGACCACACCCCGTCCAGGCCAGACACCACCCCCCACCCCACTGTCGTGGTGGCCCCA 
GATCTCTGTAATTTTATGTAGAGTTTGAGCTGAAGCCCCGTATATTTAATTTATTTTGTTAAACATGAAAGTGCATC 
CTTTCCCTCCA 

SEQ ID NO: 17 
>H53626_PEA_1_T16

GTCCGGACAGGCCGAGATGACGCCGAGCCCCCTGTTGCTGCTCCTGCTGCCGCCGCTGCTGCTGGGGGCCTTCCCGC 
CGGCCGCCGCCGCCCGAGGCCCCCCAAAGATGGCGGACAAGGTGGTCCCACGGCAGGTGGCCCGGCTGGGCCGCACT 
GTGCGGCTGCAGTGCCCAGTGGAGGGGGACCCGCCGCCGCTGACCATGTGGACCAAGGATGGCCGCACCATCCACAG 
CGGCTGGAGCCGCTTCCGCGTGCTGCCGCAGGGGCTGAAGGTGAAGCAGGTGGAGCGGGAGGATGCCGGCGTGTACG 
TGTGCAAGGCCACCAACGGCTTCGGCAGCCTGAGCGTCAACTACACCCTCGTCGTGCTGGATGACATTAGCCCAGGG 
AAGGAGAGCCTGGGGCCCGACAGCTCCTCTGGGGGTCAAGAGGACCCCGCCAGCCAGCAGTGGGCACGACCGCGCTT 
CACACAGCCCTCCAAGATGAGGCGCCGGGTGATCGCACGGCCCGTGGGTAGCTCCGTGCGGCTCAAGTGCGTGGCCA 
GCGGGCACCCTCGGCCCGACATCACGTGGATGAAGGACGACCAGGCCTTGACGCGCCCAGAGGCCGCTGAGCCCAGG 
AAGAAGAAGTGGACACTGAGCCTGAAGAACCTGCGGCCGGAGGACAGCGGCAAATACACCTGCCGCGTGTCGAACCG 
CGCGGGCGCCATCAACGCCACCTACAAGGTGGATGTGATCCAGCGGACCCGTTCCAAGCCCGTGCTCACAGGCACGC
ACCCCGTGAACACGACGGTGGACTTCGGGGGGACCACGTCCTTCCAGTGCAAGACCCAAAACCGCCAGGGCCACCTG 
TGGCCTCCTCGTCCTCGGCCACTAGCCTGCCGTGGCCCGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCATCCTG 
GGCACCCTGCTCCTGTGGCTTTGCCAGGCCCAGAAGAAGCCGTGCACCCCCGCGCCTGCCCCTCCCCTGCCTGGGCA 
CCGCCCGCCGGGGACGGCCCGCGACCGCAGCGGAGACAAGGACCTTCCCTCGTTGGCCGCCCTCAGCGCTGGCCCTG 
GTGTGGGGCTGTGTGAGGAGCATGGGTCTCCGGCAGCCCCCCAGCACTTACTGGGCCCAGGCCCAGTTGCTGGCCCT 
AAGTTGTACCCCAAACTCTACACAGACATCCACACACACACACACACACACTCTCACACACACTCACACGTGGAGGG 
CAAGGTCCACCAGCACATCCACTATCAGTGCTAGACGGCACCGTATCTGCAGTGGGCACGGGGGGGCCGGCCAGACA 
GGCAGACTGGGAGGATGGAGGACGGAGCTGCAGACGAAGGCAGGGGACCCATGGCGAGGAGGAATGGCCAGCACCCC 
AGGCAGTCTGTGTGTGAGGCATAGCCCCTGGACACACACACACAGACACACACACTACCTGGATGCATGTATGCACA 
CACATGCGCGCACACGTGCTCCCTGAAGGCACACGTACGCACACACGCACATGCACAGATATGCCGCCTGGGCACAC 
AGATAAGCTGCCCAAATGCACGCACACGCACAGAGACATGCCAGAACATACAAGGACATGCTGCCTGAACATACACA 
CGCACACCCATGCGCAGATGTGCTGCCTGGACACACACACACACACGGATATGCTGTCTGGACGCACACACGTGCAG 
ATATGGTATCCGGACACACACGTGCACAGATATGCTGCCTGGACACACAGATAATGCTGCCTTGACACACACATGCA 
CGGATATTGCCTGGACACACACACACACACGCGTGCACAGATATGCTGTCTGGACACGCACACACATGCAGATATGC 
TGCCTGGACACACACTTCCAGACACACGTGCACAGGCGCAGATATGCTGCCTGGACACACGCAGATATGCTGTCTAG 
TCACACACACACGCAGACATGCTGTCCGGACACACACACGCATGCACAGATATGCTGTCCGGACACACACACGCACG 
CAGATATGCTGCCTGGACACACACACAGATAATGCTGCCTCAACACTCACACACGTGCAGATATTGCCTGGACACAC 
ACATGTGCACAGATATGCTGTCTGGACATGCACACACGTGCAGATATGCTGTCCGGATACACACGCACGCACACATG 
CAGATATGCTGCCTGGGCACACACTTCCGGACACACATGCACACACAGGTGCAGATATGCTGCCTGGACACACGCAG 
ACTGACGTGCTTTTGGGAGGGTGTGCCGTGAAGCCTGCAGTACGTGTGCCGTGAGGCTCATAGTTGATGAGGGACTT 
TCCCTGCTCCACCGTCACTCCCCCAACTCTGCCCGCCTCTGTCCCCGCCTCAGTCCCCGCCTCCATCCCCGCCTCTG 
TCCCCTGGCCTTGGCGGCTATTTTTGCCACCTGCCTTGGGTGCCCAGGAGTCCCCTACTGCTGTGGGCTGGGGTTGG 
GGGCACAGCAGCCCCAAGCCTGAGAGGCTGGAGCCCATGGCTAGTGGCTCATCCCCACTGCATTCTCCCCCTGACAC 
AGAGAAGGGGCCTTGGTATTTATATTTAAGAAATGAAGATAATATTAATAATGATGGAAGGAAGACTGGGTTGCAGG 
GACTGTGGTCTCTCCTGGGGCCCGGGACCCGCCTGGTCTTTCAGCCATGCTGATGACCACACCCCGTCCAGGCCAGA 
CACCACCCCCCACCCCACTGTCGTGGTGGCCCCAGATCTCTGTAATTTTATGTAGAGTTTGAGCTGAAGCCCCGTAT 
ATTTAATTTATTTTGTTAAACATGAAAGTGCATCCTTTCCCTCCA 



SEQ ID NO: 18 

>H53626_PEA_1_node_15

GCCCCCCAAAGATGGCGGACAAGGTGGTCCCACGGCAGGTGGCCCGGCTGGGCCGCACTGTGCGGCTGCAGTGCCCA 
GTGGAGGGGGACCCGCCGCCGCTGACCATGTGGACCAAGGATGGCCGCACCATCCACAGCGGCTGGAGCCGCTTCCG 
CGTGCTGCCGCAGGGGCTGAAGGTGAAGCAGGTGGAGCGGGAGGATGCCGGCGTGTACGTGTGCAAGGCCACCAACG 
GCTTCGGCAGCCTGAGC 



SEQ ID NO: 19 
>H53626_PEA_1_node_22

CACGACCGCGCTTCACACAGCCCTCCAAGATGAGGCGCCGGGTGATCGCACGGCCCGTGGGTAGCTCCGTGCGGCTC 
AAGTGCGTGGCCAGCGGGCACCCTCGGCCCGACATCACGTGGATGAAGGACGACCAGGCCTTGACGCGCCCAGAGGC 
CGCTGAGCCCAGGAAGAAGAAGTGGACACTGAGCCTGAAGAACCTGCGGCCGGAGGACAGCGGCAAATACACCTGCC 
GCGTGTCGAACCGCGCGGGCGCCATCAACGCCACCTACAAGGTGGATGTGATCC 

<210> SEQ ID NO 20 

<211> Length : 1,362

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 20 
>HUMGRP5E_T4 

CCAAAATCTATGGGCTGGGACAGCAAAGATGTGGCCTACGAAGAGAAAGGTCTGGAGAATCAGAAGGCCTTCAAATG 
GTGGTTCCAAATCCCTCCAGCAAAGCCCATCCATCTTTAGAGCTCACCCGTCTCCAGCTACACCCCCCACCCCTCCC 
GGCCCAGATCAGGCAGCGGGGTCGCCCTCTCCAGGACTCTCAAGGCAGCTAAGGCTGGAGGCGCCGGCGAGCCTGGA 
GAGGGAGGAGTTCACTAAATTGTGTTGGATGGAAGGCGTCGAGGACCGGAGGAATTAATCCGATGTGGGGAAGGCGG 
ACGGGGCTACGAGGAAAAAAGAGGGGGCAATGTACACTCAGCCTTTTCATCACTCGGCGGGGAGATGGATGGTTTTC 
CGGACCGGGCGTCCCAGCGCCCCGGTTAGCTATAGGGAGACGTCAGAGCGCTCTGGTCCGCGATAGAAGAGCCCCCC 
AGCCCCCCCGCCCGGGCTTCCATATAAAGTAGGGGCCCTAGTGGAGGCCGCAGCAGTAGCACCAGCGGCTGCGGCGG 
CGGAGCTCCTCCGAGGTCCGGGTCACCAGTCTCTGCTCTTCCCAGCCTCTCCGGCGCGCTCCAAGGGCTTCCCGTCG 
GGACCATGCGCGGCAGTGAGCTCCCGCTGGTCCTGCTGGCGCTGGTCCTCTGCCTGGCGCCCCGGGGGCGAGCGGTC 
CCGCTGCCTGCGGGCGGAGGGACCGTGCTGACCAAGATGTACCCGCGCGGCAACCACTGGGCGGTGGGGCACTTAAT 
GGGGAAAAAGAGCACAGGGGAGTCTTCTTCTGTTTCTGAGAGAGGGAGCCTGAAGCAGCAGCTGAGAGAGTACATCA 
GGTGGGAAGAAGCTGCAAGGAATTTGCTGGGTCTCATAGAAGCAAAGGAGAACAGAAACCACCAGCCACCTCAACCC 
AAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAGGATAGCAGCAACTTCAAAGATGTAGGTTCAAAAGGCAA 
AGGTTCTCAACGTGAAGGAAGGAACCCCCAGCTGAACCAGCAATGATAATGATGGCCTCTCTCAAAAGAGAAAAACA 
AAACCCCTAAGAGACTGCGTTCTGCAAGCATCAGTTCTACGGATCATCAACAAGATTTCCTTGTGCAAAATATTTGA 
CTATTCTGTATCTTTCATCCTTGACTAAATTCGTGATTTTCAAGCAGCATCTTCTGGTTTAAACTTGTTTGCTGTGA 
ACAATTGTCGAAAAGAGTCTTCCAATTAATGCTTTTTTATATCTAGGCTACCTGTTGGTTAGATTCAAGGCCCCGAG 
CTGTTACCATTCACAATAAAAGCTTAAACACATTGTCCAAAGGGCAGGCTGTT 

<210> SEQ ID NO 21 
<211> Length : 1,376 
<212> Type : DNA 
<213> ORGANISM : Homo sapiens

<400> sequence : 21 
>HUMGRP5E_T5 

CCAAAATCTATGGGCTGGGACAGCAAAGATGTGGCCTACGAAGAGAAAGGTCTGGAGAATCAGAAGGCCTTCAAATG 
GTGGTTCCAAATCCCTCCAGCAAAGCCCATCCATCTTTAGAGCTCACCCGTCTCCAGCTACACCCCCCACCCCTCCC 
GGCCCAGATCAGGCAGCGGGGTCGCCCTCTCCAGGACTCTCAAGGCAGCTAAGGCTGGAGGCGCCGGCGAGCCTGGA 
GAGGGAGGAGTTCACTAAATTGTGTTGGATGGAAGGCGTCGAGGACCGGAGGAATTAATCCGATGTGGGGAAGGCGG 
ACGGGGCTACGAGGAAAAAAGAGGGGGCAATGTACACTCAGCCTTTTCATCACTCGGCGGGGAGATGGATGGTTTTC 
CGGACCGGGCGTCCCAGCGCCCCGGTTAGCTATAGGGAGACGTCAGAGCGCTCTGGTCCGCGATAGAAGAGCCCCCC 
AGCCCCCCCGCCCGGGCTTCCATATAAAGTAGGGGCCCTAGTGGAGGCCGCAGCAGTAGCACCAGCGGCTGCGGCGG 
CGGAGCTCCTCCGAGGTCCGGGTCACCAGTCTCTGCTCTTCCCAGCCTCTCCGGCGCGCTCCAAGGGCTTCCCGTCG 
GGACCATGCGCGGCAGTGAGCTCCCGCTGGTCCTGCTGGCGCTGGTCCTCTGCCTGGCGCCCCGGGGGCGAGCGGTC 
CCGCTGCCTGCGGGCGGAGGGACCGTGCTGACCAAGATGTACCCGCGCGGCAACCACTGGGCGGTGGGGCACTTAAT 
GGGGAAAAAGAGCACAGGGGAGTCTTCTTCTGTTTCTGAGAGAGGGAGCCTGAAGCAGCAGCTGAGAGAGTACATCA 
GGTGGGAAGAAGCTGCAAGGAATTTGCTGGGTCTCATAGAAGCAAAGGAGAACAGAAACCACCAGCCACCTCAACCC 
AAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAGGATAGCAGCAACTTCAAAGATGTAGGTTCAAAAGGCAA 
AGACTCTCTGCTCCAGGTTCTCAACGTGAAGGAAGGAACCCCCAGCTGAACCAGCAATGATAATGATGGCCTCTCTC 
AAAAGAGAAAAACAAAACCCCTAAGAGACTGCGTTCTGCAAGCATCAGTTCTACGGATCATCAACAAGATTTCCTTG 
TGCAAAATATTTGACTATTCTGTATCTTTCATCCTTGACTAAATTCGTGATTTTCAAGCAGCATCTTCTGGTTTAAA 
CTTGTTTGCTGTGAACAATTGTCGAAAAGAGTCTTCCAATTAATGCTTTTTTATATCTAGGCTACCTGTTGGTTAGA 
TTCAAGGCCCCGAGCTGTTACCATTCACAATAAAAGCTTAAACACATTGTCCAAAGGGCAGGCTGTT 

<210> SEQ ID NO 22 

<211> Length : 902 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 22 
>D56406_PEA_1_T3

TTCACTCACTTTCAAAGCCAGCTGAAGGAAAGAGGAAGTGCTAGAGAGAGCCCCCTTCAGTGTGCTTCTGACTTTTA 
CGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGGAATGAAAATCCAGCTTGTATGCATGCTACTCCTGGCTT 
TCAGCTCCTGGAGTCTGTGCTCAGATTCAGAAGAGGAAATGAAAGCATTAGAAGCAGATTTCTTGACCAATATGCAT 
ACATCAAAGATTAGTAAAGCACATGTTCCCTCTTGGAAGATGACTCTGCTAAATGTTTGCAGTCTTGTAAATAATTT 
GAACAGCCCAGCTGAGGAAACAGGAGAAGTTCATGAAGAGGAGCTTGTTGCAAGAAGGAAACTTCCTACTGCTTTAG
ATGGCTTTAGCTTGGAAGCAATGTTGACAATATACCAGCTCCACAAAATCTGTCACAGCAGGGCTTTTCAACACTGG 
GAGGCACGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAAGAGATGGA 
GACCATCCCGGCTAACACGTTAATCCAGGAAGATATTCTTGATACTGGAAATGACAAAAATGGAAAGGAAGAAGTCA 
TAAAGAGAAAAATTCCTTATATTCTGAAACGGCAGCTGTATGAGAATAAACCCAGAAGACCCTACATACTCAAAAGA 
GATTCTTACTATTACTGAGAGAATAAATCATTTATTTACATGTGATTGTGATTCATCATCCCTTAATTAAATATCAA 
ATTATATTTGTGTGAAAATGTGACAAACACACTTATCTGTCTCTTCTACAATTGTGGTTTATTGAATGTGATTTTTC 
TGCACTAATATAAATTAGACTAAGTGTTTTCAAATAAATCTAAATCTTCAGCATG 

<210> SEQ ID NO 23 

<211> Length : 1,239 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 23 
>D56406_PEA_1_T6

TTCACTCACTTTCAAAGCCAGCTGAAGGAAAGAGGAAGTGCTAGAGAGAGCCCCCTTCAGTGTGCTTCTGACTTTTA 
CGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGGAATGAAAATCCAGCTTGTATGCATGCTACTCCTGGCTT 
TCAGCTCCTGGAGTCTGTGCTCAGAAGAGGAAATGAAAGCATTAGAAGCAGATTTCTTGACCAATATGCATACATCA 
AAGATTAGTAAAGCACATGTTCCCTCTTGGAAGATGACTCTGCTAAATGTTTGCAGTCTTGTAAATAATTTGAACAG 
CCCAGCTGAGGAAACAGGAGAAGTTCATGAAGAGGAGCTTGTTGCAAGAAGGAAACTTCCTACTGCTTTAGATGGCT 
TTAGCTTGGAAGCAATGTTGACAATATACCAGCTCCACAAAATCTGTCACAGCAGGGCTTTTCAACACTGGGAGTTA 
ATCCAGGAAGATATTCTTGATACTGGAAATGACAAAAATGGAAAGGAAGAAGTCATAAAGAGAAAAATTCCTTATAT 
TCTGAAACGGCAGCTGTATGAGAATAAACCCAGAAGACCCTACATACTCAAAAGAGATTCTTACTATTACTGAGAGA 
ATAAATCATTTATTTACATGTGATTGTGATTCATCATCCCTTAATTAAATATCAAATTATATTTGTGTGAAAATGTG 
ACAAACACACTTATCTGTCTCTTCTACAATTGTGGTTTATTGAATGTGATTTTTCTGCACTAATATAAATTAGACTA 
AGTGTTTTCAAATAAATCTAAATCTTCAGCATGATGTGTTGTGTATAATTGGAGTAGATATTAATTAAGTCACCTGT 
ATAATGTTTTGTAATTTTGCAAAACATATCTTGAGTTGTTTAAACAGTCAAAATGTTTGATATTTTATACCAGCTTA 
TGAGCTCAAAGTACTACAGCAAAGCCTAGCCTGCATATCATTCACCCAAAACAAAGTAATAGCGCCTCTTTTATTAT 
TTTGACTGAATGTTTTATGGAATTGAAAGAAACATACGTTCTTTTCAAGACTTCCTCATGAATCTCTCAATTATAGG 
AAAAGTTATTGTGATAAAATAGGAACAGCTGAAAGATTGATTAATGAACTATTGTTAATTCTTCCTATTTTAATGAA 
TGACATTGAACTGAATTTTTTGTCTGTTAAATGAACTTGATAGCTAATAAAAAGACAACTAGCCATCAAAATCAAAA 
GTTTCTC 



<210> SEQ ID NO 24 
<211> Length : 1,020

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 24 
>D56406_PEA_1_T7

TTCACTCACTTTCAAAGCCAGCTGAAGGAAAGAGGAAGTGCTAGAGAGAGCCCCCTTCAGTGTGCTTCTGACTTTTA 
CGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGGAATGAAAATCCAGCTTGTATGCATGCTACTCCTGGCTT 
TCAGCTCCTGGAGTCTGTGCTCAGATTCAGAAGAGGAAATGAAAGCATTAGAAGCAGATTTCTTGACCAATATGCAT 
ACATCAAAGTTAATCCAGGAAGATATTCTTGATACTGGAAATGACAAAAATGGAAAGGAAGAAGTCATAAAGAGAAA 
AATTCCTTATATTCTGAAACGGCAGCTGTATGAGAATAAACCCAGAAGACCCTACATACTCAAAAGAGATTCTTACT 
ATTACTGAGAGAATAAATCATTTATTTACATGTGATTGTGATTCATCATCCCTTAATTAAATATCAAATTATATTTG 
TGTGAAAATGTGACAAACACACTTATCTGTCTCTTCTACAATTGTGGTTTATTGAATGTGATTTTTCTGCACTAATA 
TAAATTAGACTAAGTGTTTTCAAATAAATCTAAATCTTCAGCATGATGTGTTGTGTATAATTGGAGTAGATATTAAT 
TAAGTCACCTGTATAATGTTTTGTAATTTTGCAAAACATATCTTGAGTTGTTTAAACAGTCAAAATGTTTGATATTT 
TATACCAGCTTATGAGCTCAAAGTACTACAGCAAAGCCTAGCCTGCATATCATTCACCCAAAACAAAGTAATAGCGC 
CTCTTTTATTATTTTGACTGAATGTTTTATGGAATTGAAAGAAACATACGTTCTTTTCAAGACTTCCTCATGAATCT 
CTCAATTATAGGAAAAGTTATTGTGATAAAATAGGAACAGCTGAAAGATTGATTAATGAACTATTGTTAATTCTTCC
TATTTTAATGAATGACATTGAACTGAATTTTTTGTCTGTTAAATGAACTTGATAGCTAATAAAAAGACAACTAGCCA
TCAAAATCAAAAGTTTCTC 

<210> SEQ ID NO 25 

<211> Length : 1,737

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 25 
>F05068_PEA_1_T3

AAGAAAGGGAAGGCAACCGGGCAGCCCAGGCCCCGCCCCGCCGCTCCCCCACCCGTGCGCTTATAAAGCACAGGAAC 
CAGAGCTGGCCACTCAGTGGTTTCTTGGTGACACTGGATAGAACAGCTCAAGCCTTGCCACTTCGGGCTTCTCACTG 
CAGCTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACTTTCTCCTTCAAGTACTTGGCAGAT 
CACTCTCTTAGCAGGGTCTGCGCTTCGCAGCCGGGATGAAGCTGGTTTCCGTCGCCCTGATGTACCTGGGTTCGCTC 
GCCTTCCTAGGCGCTGACACCGCTCGGTTGGATGTCGCGTCGGAGTTTCGAAAGAAGTGAGTCCGGGCAGCGCCTTC 
CCCCTTGCTGGTACCTGGCAGGCAAGGGGAACTGACCGTTGGTCCCGAAGGTCTAGAAGTGAATGGGAGCAGGGACA
GGCCTGGGCGTCACCTGAACGCACGCGAATCGGGTCTGCTTGTGTTTTCCAGGTGGAATAAGTGGGCTCTGAGTCGT 
GGGAAGAGGGAACTGCGGATGTCCAGCAGCTACCCCACCGGGCTCGCTGACGTGAAGGCCGGGCCTGCCCAGACCCT 
TATTCGGCCCCAGGACATGAAGGGTGCCTCTCGAAGCCCCGAAGACAGCAGTCCGGATGCCGCCCGCATCCGAGTCA 
AGCGCTACCGCCAGAGCATGAACAACTTCCAGGGCCTCCGGAGCTTTGGCTGCCGCTTCGGGACGTGCACGGTGCAG 
AAGCTGGCACACCAGATCTACCAGTTCACAGATAAGGACAAGGACAACGTCGCCCCCAGGAGCAAGATCAGCCCCCA 
GGGCTACGGCCGCCGGCGCCGGCGCTCCCTGCCCGAGGCCGGCCCGGGTCGGACTCTGGTGTCTTCTAAGCCACAAG 
CACACGGGGCTCCAGCCCCCCCGAGTGGAAGTGCTCCCCACTTTCTTTAGGATTTAGGCGCCCATGGTACAAGGAAT 
AGTCGCGCAAGCATCCCGCTGGTGCCTCCCGGGACGAAGGACTTCCCGAGCGGTGTGGGGACCGGGCTCTGACAGCC 
CTGCGGAGACCCTGAGTCCGGGAGGCACCGTCCGGCGGCGAGCTCTGGCTTTGCAAGGGCCCCTCCTTCTGGGGGCT 
TCGCTTCCTTAGCCTTGCTCAGGTGCAAGTGCCCCAGGGGGCGGGGTGCAGAAGAATCCGAGTGTTTGCCAGGCTTA 
AGGAGAGGAGAAACTGAGAAATGAATGCTGAGACCCCCGGAGCAGGGGTCTGAGCCACAGCCGTGCTCGCCCACAAA 
CTGATTTCTCACGGCGTGTCACCCCACCAGGGCGCAAGCCTCACTATTACTTGAACTTTCCAAAACCTAAAGAGGAA 
AAGTGCAATGCGTGTTGTACATACAGAGGTAACTATCAATATTTAAGTTTGTTGCTGTCAAGATTTTTTTTGTAACT 
TCAAATATAGAGATATTTTTGTACGTTATATATTGTATTAAGGGCATTTTAAAAGCAATTATATTGTCCTCCCCCTA 
TTTTAAGACGTGAATGTCTCAGCGAGGTGTAAAGTTGTTCGCCGCGTGGAATGTGAGTGTGTTTGTGTGCATGAAAG 
AGAAAGACTGATTACCTCCTGTGTGGAAGAAGGAAACACCGAGTCTCTGTATAATCTATTTACATAAAATGGGTGAT 
ATGCGAACAGCAAACCAATAAACTGTCTCAATGCTGAATAAAA 

<210> SEQ ID NO 26 

<211> Length : 1,820

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 26 
>F05068_PEA_1_T4

AAGAAAGGGAAGGCAACCGGGCAGCCCAGGCCCCGCCCCGCCGCTCCCCCACCCGTGCGCTTATAAAGCACAGGAAC 
CAGAGCTGGCCACTCAGTGGTTTCTTGGTGACACTGGATAGAACAGCTCAAGCCTTGCCACTTCGGGCTTCTCACTG 
CAGCTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACTTTCTCCTTCAAGTACTTGGCAGAT 
CACTCTCTTAGCAGGGTCTGCGCTTCGCAGCCGGGATGAAGCTGGTTTCCGTCGCCCTGATGTACCTGGGTTCGCTC 
GCCTTCCTAGGCGCTGACACCGCTCGGTTGGATGTCGCGTCGGAGTTTCGAAAGAAGTGGAATAAGTGGGCTCTGAG 
TCGTGGGAAGAGGGAACTGCGGATGTCCAGCAGCTACCCCACCGGGCTCGCTGACGTGAAGGCCGGGCCTGCCCAGA 
CCCTTATTCGGCCCCAGGACATGAAGGGTGCCTCTCGAAGCCCCGAAGACAGGTAACTACGCCCTGTGCTGTCCAGG 
GACGGGAGGGAAGGAAGGTGTGCGGGAGGAGTTCTCTGTCTCCACTCCCCTGGCCCGGGGGATCGTCGGGGCTGGAC 
CGCAGCTCAGATGGCGCGAGCAGTTTCCAGCTCCCTCTGGCTCTAGAATGGCTCCCGTTCCCGGTGTTGGGGCCAAA 
GCTCTGCTTGATGGGGTCTCAAGTTGCCTTTCTTCCCCCTCCCCCCGCCCGCAGCAGTCCGGATGCCGCCCGCATCC 
GAGTCAAGCGCTACCGCCAGAGCATGAACAACTTCCAGGGCCTCCGGAGCTTTGGCTGCCGCTTCGGGACGTGCACG
GTGCAGAAGCTGGCACACCAGATCTACCAGTTCACAGATAAGGACAAGGACAACGTCGCCCCCAGGAGCAAGATCAG 
CCCCCAGGGCTACGGCCGCCGGCGCCGGCGCTCCCTGCCCGAGGCCGGCCCGGGTCGGACTCTGGTGTCTTCTAAGC 
CACAAGCACACGGGGCTCCAGCCCCCCCGAGTGGAAGTGCTCCCCACTTTCTTTAGGATTTAGGCGCCCATGGTACA 
AGGAATAGTCGCGCAAGCATCCCGCTGGTGCCTCCCGGGACGAAGGACTTCCCGAGCGGTGTGGGGACCGGGCTCTG 
ACAGCCCTGCGGAGACCCTGAGTCCGGGAGGCACCGTCCGGCGGCGAGCTCTGGCTTTGCAAGGGCCCCTCCTTCTG 
GGGGCTTCGCTTCCTTAGCCTTGCTCAGGTGCAAGTGCCCCAGGGGGCGGGGTGCAGAAGAATCCGAGTGTTTGCCA 
GGCTTAAGGAGAGGAGAAACTGAGAAATGAATGCTGAGACCCCCGGAGCAGGGGTCTGAGCCACAGCCGTGCTCGCC 
CACAAACTGATTTCTCACGGCGTGTCACCCCACCAGGGCGCAAGCCTCACTATTACTTGAACTTTCCAAAACCTAAA 
GAGGAAAAGTGCAATGCGTGTTGTACATACAGAGGTAACTATCAATATTTAAGTTTGTTGCTGTCAAGATTTTTTTT 
GTAACTTCAAATATAGAGATATTTTTGTACGTTATATATTGTATTAAGGGCATTTTAAAAGCAATTATATTGTCCTC 
CCCCTATTTTAAGACGTGAATGTCTCAGCGAGGTGTAAAGTTGTTCGCCGCGTGGAATGTGAGTGTGTTTGTGTGCA 
TGAAAGAGAAAGACTGATTACCTCCTGTGTGGAAGAAGGAAACACCGAGTCTCTGTATAATCTATTTACATAAAATG 
GGTGATATGCGAACAGCAAACCAATAAACTGTCTCAATGCTGAATAAAA 

<210> SEQ ID NO 27 

<211> Length : 1,970

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 27 
>F05068_PEA_1_T6

AAGAAAGGGAAGGCAACCGGGCAGCCCAGGCCCCGCCCCGCCGCTCCCCCACCCGTGCGCTTATAAAGCACAGGAAC 
CAGAGCTGGCCACTCAGTGGTTTCTTGGTGACACTGGATAGAACAGCTCAAGCCTTGCCACTTCGGGCTTCTCACTG
CAGCTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACTTTCTCCTTCAAGTACTTGGCAGAT 
CACTCTCTTAGCAGGGTCTGCGCTTCGCAGCCGGGATGAAGCTGGTTTCCGTCGCCCTGATGTACCTGGGTTCGCTC 
GCCTTCCTAGGCGCTGACACCGCTCGGTTGGATGTCGCGTCGGAGTTTCGAAAGAAGTGAGTCCGGGCAGCGCCTTC 
CCCCTTGCTGGTACCTGGCAGGCAAGGGGAACTGACCGTTGGTCCCGAAGGTCTAGAAGTGAATGGGAGCAGGGACA 
GGCCTGGGCGTCACCTGAACGCACGCGAATCGGGTCTGCTTGTGTTTTCCAGGTGGAATAAGTGGGCTCTGAGTCGT 
GGGAAGAGGGAACTGCGGATGTCCAGCAGCTACCCCACCGGGCTCGCTGACGTGAAGGCCGGGCCTGCCCAGACCCT 
TATTCGGCCCCAGGACATGAAGGGTGCCTCTCGAAGCCCCGAAGACAGGTAACTACGCCCTGTGCTGTCCAGGGACG 
GGAGGGAAGGAAGGTGTGCGGGAGGAGTTCTCTGTCTCCACTCCCCTGGCCCGGGGGATCGTCGGGGCTGGACCGCA 
GCTCAGATGGCGCGAGCAGTTTCCAGCTCCCTCTGGCTCTAGAATGGCTCCCGTTCCCGGTGTTGGGGCCAAAGCTC 
TGCTTGATGGGGTCTCAAGTTGCCTTTCTTCCCCCTCCCCCCGCCCGCAGCAGTCCGGATGCCGCCCGCATCCGAGT 
CAAGCGCTACCGCCAGAGCATGAACAACTTCCAGGGCCTCCGGAGCTTTGGCTGCCGCTTCGGGACGTGCACGGTGC 
AGAAGCTGGCACACCAGATCTACCAGTTCACAGATAAGGACAAGGACAACGTCGCCCCCAGGAGCAAGATCAGCCCC 
CAGGGCTACGGCCGCCGGCGCCGGCGCTCCCTGCCCGAGGCCGGCCCGGGTCGGACTCTGGTGTCTTCTAAGCCACA
AGCACACGGGGCTCCAGCCCCCCCGAGTGGAAGTGCTCCCCACTTTCTTTAGGATTTAGGCGCCCATGGTACAAGGA 
ATAGTCGCGCAAGCATCCCGCTGGTGCCTCCCGGGACGAAGGACTTCCCGAGCGGTGTGGGGACCGGGCTCTGACAG 
CCCTGCGGAGACCCTGAGTCCGGGAGGCACCGTCCGGCGGCGAGCTCTGGCTTTGCAAGGGCCCCTCCTTCTGGGGG 
CTTCGCTTCCTTAGCCTTGCTCAGGTGCAAGTGCCCCAGGGGGCGGGGTGCAGAAGAATCCGAGTGTTTGCCAGGCT 
TAAGGAGAGGAGAAACTGAGAAATGAATGCTGAGACCCCCGGAGCAGGGGTCTGAGCCACAGCCGTGCTCGCCCACA 
AACTGATTTCTCACGGCGTGTCACCCCACCAGGGCGCAAGCCTCACTATTACTTGAACTTTCCAAAACCTAAAGAGG 
AAAAGTGCAATGCGTGTTGTACATACAGAGGTAACTATCAATATTTAAGTTTGTTGCTGTCAAGATTTTTTTTGTAA 
CTTCAAATATAGAGATATTTTTGTACGTTATATATTGTATTAAGGGCATTTTAAAAGCAATTATATTGTCCTCCCCC 
TATTTTAAGACGTGAATGTCTCAGCGAGGTGTAAAGTTGTTCGCCGCGTGGAATGTGAGTGTGTTTGTGTGCATGAA 
AGAGAAAGACTGATTACCTCCTGTGTGGAAGAAGGAAACACCGAGTCTCTGTATAATCTATTTACATAAAATGGGTG 
ATATGCGAACAGCAAACCAATAAACTGTCTCAATGCTGAATAAAA 

<210> SEQ ID NO 28 

<211> Length : 1,745

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 28 
>H14624_T20 

TTATGCTCCCGCGGAGGCCAAGCGGACTCCCTGACAGGACAGAATCTGAACGTGAGAGTGAAGGTCTTGCCTGTCCA 
GAAACTCTTGTAGCCAGCACAGGTTTAAACAAGAAGCCAAATTGTTCTGGAGAGATTGCTGGGGGCTTTCTTTGTGC 
CTCAAGCTTCTTCAGTGCCCTGAGCACAGGAAACACTCAAGCAGAGAAGCAGAGCCAAACCCAGGATACGGGAGGTC 
GAGGCTCTTCCGTAGACCTGCAGCATTGGGGTGGGATGATGTTCATTCTGTGTGTGTTCTGGACCAAGCCCCTCTCC 
AGGGACCTATGGGCAGCCCCCTTTAAGCAAGATGCCCGGTGGAGTGGGCATCCACCATCACTTACCCTGGGCTTGGG 
TGAATAGATTTTCCGTGCCTTAAATGGGCAGGGAGGGGGTAAACAXGGACGGTCCATTGGTACAAATAAAAGCCTTT 
GGTGGGTTTTGATCAATTGCAAGGATCGAAGGAGACCTGTGGACCTGAGGTCAACTGGCAGCAGAGAAGAGTCTGGG 
TTCGTGAAGGCGCCGCCGCGGTGCCGCGCCACGTATTTGCATAAAAAAGGCCAAGAAAACTCTGGCTGTGCCCCAGC 
AACGGCTCATTCTGCTCCCCCGGGTCGGAGCCCCCCGGAGCTGCGCGCGGGCTTGCAGCGCCTCGCCCGCGCTGTCC 
TCCCGGTGTCCCGCTTCTCCGCGCCCCAGCCGCCGGCTGCCAGCTTTTCGGGGCCCCGAGTCGCACCCAGCGAAGAG 
AGCGGGCCCGGGACAAGCTCGAACTCCGGCCGCCTCGCCCTTCCCCGGCTCCGCTCCCTCTGCCCCCTCGGGGTCGC 
GCGCCCACGATGCTGCAGGGCCCTGGCTCGCTGCTGCTGCTCTTCCTCGCCTCGCACTGCTGCCTGGGCTCGGCGCG 
CGGGCTCTTCCTCTTTGGCCAGCCCGACTTCTCCTACAAGCGCAGCAATTGCAAGCCCATCCCGGCCAACCTGCAGC 
TGTGCCACGGCATCGAATACCAGAACATGCGGCTGCCCAACCTGCTGGGCCACGAGACCATGAAGGAGGTGCTGGAG 
CAGGCCGGCGCTTGGATCCCGCTGGTCATGAAGCAGTGCCACCCGGACACCAAGAAGTTCCTGTGCTCGCTCTTCGC 
CCCCGTCTGCCTCGATGACCTAGACGAGACCATCCAGCCATGCCACTCGCTCTGCGTGCAGGTGAAGGACCGCTGCG
CCCCGGTCATGTCCGCCTTCGGCTTCCCCTGGCCCGACATGCTTGAGTGCGACCGTTTCCCCCAGGACAACGACCTT 
TGCATCCCCCTCGCTAGCAGCGACCACCTCCTGCCAGCCACCGAGGAAGGTAAGCCTTCCCTCTTGCTTCCCCACTC 
CCTGCTGGGCTGAGACGCTCCCAGGAGATCCCGCCCCTGCCACGCATCCCAGTGCATCCCTGCTTGGGGTGCCAGTA 
GCGGGAAGGGCAGAAGTTCTGCCTGACCTGGTCTGTCATCACAACAAGCCTGTATCAAATTTGAGGCACCCCTCCCA 
CGCCGCCCAAGTCTCGCGCATTCTCCTCCCGAGTTGTACCAGCTATACTTAAGGGCAGTTTAAAAATAAAACAAACA 
AACAAAAACAACAAAACTAAAAAAACGAAGAACTGAACGGCGGTTTAAAAAAAAATAGATACACGATTATTGTTAAA 
GATGCTAGCACTGGAGCTGCGCAGAGCGTTGGAAGTGGTGTTTGGTGGAGG 



<210> SEQ ID NO 29 

<211> Length : 3,170

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 29 
>H38804_PEA_1_T24

GACTGGGTTGACCGATGCTGGGCAGCTGAGCGGACCAATCGGCCCCCTAGACTGAGACGTTGGCGTTTGAAATCAGC 
CAATGGCAGGTCTACACTGGAGCTTCCTCTCCGCCTCCTTCGCCTAGCCTGCGAGTGTTCTGAGGGAAGCAAGGAGG 
CGGCGGCGGCCGCAGCGAGTGGCGAGTAGTGGAAACGTTGCTTCTGAGGGGAGCCCAAGGTAGGGAGGCGAGGCGAC 
GGTGTGCGGGAGCGGGCTCTCCAGGGACTTCCCGGGTCCGCAACTGGCAGGGCCGTTCGATTCGCAGGGGATCCCGT 
TTCGTTTCTGTTGTTTTCCCTTTATTTTTAGGAGTGCCCGGGGCGACGGGACCCCGGGAGAGGGGAAAGGGAACAGT 
CTGGGGTCCGGGCATCGCTGTGGGCCGGGCTGGGTTTAGGGGGACGGCGGTGCGGGCTGGGCCGGTTTGGGCGCGGC 
GGGGGCCGGATGATGGGGCGAGTCCGGACCTTGGCGGGCGAGTGCTCGGCGCAGGCGCAAGCGCAGAGTCTCCTCGC 
GGTCGTCCTCTCGGCCCCTCCCTCTGGGGGGACCCCCAGTGCCAGGCTGTCAGTGCGCAGCCCCAGCCCGCGGGACC 
CCTGGGGACTCTGGGCGCCTGTTCTGCAGATGACCGGTTCTAACGAGTTCAAGCTGAACCAGCCACCCGAGGATGGC 
ATCTCCTCCGTGAAGTTCAGCCCCAACACCTCCCAGTTCCTGCTTGTCTCCTCCTGGGACACGTCCGTGCGTCTCTA 
CGATGTGCCGGCCAACTCCATGCGGCTCAAGTACCAGCACACCGGCGCCGTCCTGGACTGCGCCTTCTACGATCCAA 
CGCATGCCTGGAGTGGAGGACTAGATCATCAATTGAAAATGCATGATTTGAACACTGATCAAGAAAATCTTGTTGGG 
ACCCATGATGCCCCTATCAGATGTGTTGAATACTGTCCAGAAGTGAATGTGATGGTCACTGGAAGTTGGGATCAGAC 
AGTTAAACTGTGGGATCCCAGAACTCCTTGTAATGCTGGGACCTTCTCTCAGCCTGAAAAGGTATATACCCTCTCAG 
TGTCTGGAGACCGGCTGATTGTGGGAACAGCAGGCCGCAGAGTGTTGGTGTGGGACTTACGGAACATGGGTTACGTG 
CAGCAGCGCAGGGAGTCCAGCCTGAAATACCAGACTCGCTGCATACGAGCGTTTCCAAACAAGCAGGGTTATGTATT 
AAGCTCTATTGAAGGCCGAGTGGCAGTTGAGTATTTGGACCCAAGCCCTGAGGTACAGAAGAAGAAGTATGCCTTCA 
AATGTCACAGACTAAAAGAAAATAATATTGAGCAGATTTACCCAGTCAATGCCATTTCTTTTCACAATATCCACAAT 
ACATTTGCCACAGGTGGTTCTGATGGCTTTGTAAATATTTGGGATCCATTTAACAAAAAGCGACTGTGCCAATTCCA
TCGGTACCCCACGAGCATCGCATCACTTGCCTTCAGTAATGATGGGACTACGCTTGCAATAGCGTCATCATATATGT 
ATGAAATGGATGACACAGAACATCCTGAAGATGGTATCTTCATTCGCCAAGTGACAGATGCAGAAACAAAACCCAAG 
TCACCATGTACTTGACAAGATTTCATTTACTTAAGTGCCATGTTGATGATAATAAAACAATTCGTACTCCCCAATGG 
TGGATTTATTACTATTAAAGAAACCAGGGAAAATATTAATTTTAATATTATAACAACCTGAAAATAATGGAAAAGAG 
GTTTTTGAATTTTTTTTTTTAAATAAACACCTTCTTAAGTGCATGAGATGGTTTGATGGTTTGCTGCATTAAAGGTA 
TTTGGGCAAACAAAATTGGAGGGCAAGTGACTGCAGTTTTGAGAATCAGTTTTGACCTTGATGATTTTTTGTTTCCA 
CTGTGGAAATAAATGTTTGTAAATAAGTGTAATAAAAATCCCTTTGCATTCTTTCTGGACCTTAAATGGTAGAGGAA 
AAGGCTCGTGAGCCATTTGTTTCTTTTGCTGGTTATAGTTGCTAATTCTAAAGCTGCTTCAGACTGCTTCATGAGGA 
GGTTAATCTACAATTAAACAATATTTCCTCTTGGCCGTCCATTATTTTCTGAAGCAGATGGTTCATCATTTCCTGGG 
CTGTTAAACAAAGCGAGGTTAAGGTTAGACTCTTGGGAATCAGCTAGTTTTCAATCTTATTAGGGTGCAGAAGGAAA 
ACTAATAAGAAAACCTCCTAATATCATTTTGTGACTGTAAACAATTATTTATTAGCAAACAATTGATCCCAGAAGGG 
CAAATTGTTTGAGTCAGTAATGAGCTGAGAAAAGACAGAGCATATCTGTGTATTTGGAAAAATAATTGTAACGTAAT 
TGCAGTGCATTTAGACAGGCATCTATTTGGACCTGTTTCTATCTCTAAATGAATTTTTGGAAACATTAATGAGGTTT 
ACATATTTCTCTGACATTTATATAGTTCTTATGTCCATTTCAGTTGACCAGCCGCTGGTGATTAAAGTTAAAAAGAA 
AAAAATTATAGTGAGAATGAGATTCATTTCAATGTAATGCACTAAAGCAGAACACGAACTTAGCTTGGCCTATTCTA 
GGTAGTTCCAAATAGTATTTTTGTTGTCAAACTTTAAAATTTATATTAATTTGCAAATGTATGTCTCTGAGTAGGAC 
TTGGACCTTTCCTGAGATTTATTTTATCCGTGATGTATTTTTTTTAATTCTTTTGATACAGAGAAGGGTCTTTTTTT 
TTTTTAAGTATTTCAGTGAAAACTTGGTGTAAGTCTGAACCCATCTTTTGAAATGTATTTTCTTCATTGCAGGTCCA 
CCTAATCATCCTGTGAAAGTGGTTTCTCTATGGAAAGCTTTGTTTGCTTCCTACAAATACATGCTTATTCCTTAAGG 
GATGTGTTAGAGTTACTGTGGATTTCTCTGTTTTCTGTCTTACAAGAAACTTGTCTATGTACCTTAATACTTTGTTT 
AGGATGAGGAGTCTTTGTGTCCCTGTACAGTAGTCTGACGTATTTCCCCTTCTGTCCCCTAGTAAGCCCAGTTGCTG 
TATCTGAACAGTTTGAGCTCTTTTTGTAATATACTCTAAACCTGTTATTTCTGTGCTAATAAACGAGATGCAGAACC 
CTTGAAAAATGGA 

<210> SEQ ID NO 30 

<211> Length : 4,161 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens

<400> sequence : 30 
>H38804_PEA_1_T8

GACTGGGTTGACCGATGCTGGGCAGCTGAGCGGACCAATCGGCCCCCTAGACTGAGACGTTGGCGTTTGAAATCAGC 
CAATGGCAGGTCTACACTGGAGCTTCCTCTCCGCCTCCTTCGCCTAGCCTGCGAGTGTTCTGAGGGAAGCAAGGAGG 
CGGCGGCGGCCGCAGCGAGTGGCGAGTAGTGGAAACGTTGCTTCTGAGGGGAGCCCAAGGTAGGGAGGCGAGGCGAC 
GGTGTGCGGGAGCGGGCTCTCCAGGGACTTCCCGGGTCCGCAACTGGCAGGGCCGTTCGATTCGCAGGGGATCCCGT
TTCGTTTCTGTTGTTTTCCCTTTATTTTTAGGAGTGCCCGGGGCGACGGGACCCCGGGAGAGGGGAAAGGGAACAGT
CTGGGGTCCGGGCATCGCTGTGGGCCGGGCTGGGTTTAGGGGGACGGCGGTGCGGGCTGGGCCGGTTTGGGCGCGGC
GGGGGCCGGATGATGGGGCGAGTCCGGACCTTGGCGGGCGAGTGCTCGGCGCAGGCGCAAGCGCAGAGTCTCCTCGC
GGTCGTCCTCTCGGCCCCTCCCTCTGGGGGGACCCCCAGTGCCAGGCTGTCAGTGCGCAGCCCCAGCCCGCGGGACC
CCTGGGGACTCTGGGCGCCTGTTCTGCAGATGACCGGTTCTAACGAGTTCAAGCTGAACCAGCCACCCGAGGATGGC
ATCTCCTCCGTGAAGTTCAGCCCCAACACCTCCCAGTTCCTGCTTGTCTCCTCCTGGGACACGTCCGTGCGTCTCTA
CGATGTGCCGGCCAACTCCATGCGGCTCAAGTACCAGCACACCGGCGCCGTCCTGGACTGCGCCTTCTACGATCCAA
CGCATGCCTGGAGTGGAGGACTAGATCATCAATTGAAAATGCATGATTTGAACACTGATCAAGAAAATCTTGTTGGG
ACCCATGATGCCCCTATCAGATGTGTTGAATACTGTCCAGAAGTGAATGTGATGGTCACTGGAAGTTGGGATCAGAC
AGTTAAACTGTGGGATCCCAGAACTCCTTGTAATGCTGGGACCTTCTCTCAGCCTGAAAAGGTATATACCCTCTCAG
TGTCTGGAGACCGGCTGATTGTGGGAACAGCAGGCCGCAGAGTGTTGGTGTGGGACTTACGGAACATGGGTTACGTG
CAGCAGCGCAGGGAGTCCAGCCTGAAATACCAGACTCGCTGCATACGAGCGTTTCCAAACAAGCAGGGTTATGTATT
AAGCTCTATTGAAGGCCGAGTGGCAGTTGAGTATTTGGACCCAAGCCCTGAGGTACAGAAGAAGAAGTATGCCTTCA
AATGTCACAGACTAAAAGAAAATAATATTGAGCAGATTTACCCAGTCAATGCCATTTCTTTTCACAATATCCACAAT
ACATTTGCCACAGGTGGTTCTGATGGCTTTGTAAATATTTGGGATCCATTTAACAAAAAGCGACTGTGCCAATTCCA
TCGGTACCCCACGAGCATCGCATCACTTGCCTTCAGTAATGATGGGACTACGCTTGCAATAGCGTCATCATATATGT
ATGAAATGGATGACACAGAACATCCTGAAGATGGTATCTTCATTCGCCAAGTGACAGATGCAGAAACAAAACCCAAG
TGAGTATGCTTCACCTGTATTTGAGCCTTTTCTTGCATTCAACCCAGGATTTATTAATTTTTCTAAATTCATGAATA
GCATTGTTGATGCCTGCTCGATATTACAGCTGACTGTAGGGTTGGAGTTGATGTTATCATGTTCTCCCAAGCTTTCA
ATATCCGTAGGTTGATAGACGTCTGATGGATAAAATTGTGCCTAGTTGTTTTGTAGAGAAGAATGTCAAACTCTTAT
TCTTCTTGAATAGGCTCTATTATTTGAATCTCTGGAGTTATTACCAGCTCATTGCTTCAAAATTAAGTTGAGGAATT
CAAGAATAATTTATTTTAGTAAATTCTATTTAAGATGTTTAAGAATTTGAACTGCCAAAAATCTTTCCTCTCCACAG
AGGTTGTTTCTTTAATATTAACACAAAGTAAGTGACCTTCAGGTCTTATTGGAAACTCAGAGTAATATGGCCTTGCC
TGGAATTGCAAATTTCCTTAGTTTTGAAATTTTCATAGATGTCTTTGGTTCTTGGTTGTAACTGTTGACTGAGAAGA
GCCATTTACATTTTTTGATACCAACAGGGCAAAGCTTTTTACTTAATTACCTCTACCAGGCTTTAAGGGAAATCTGA
TACTTCAGCATGTGTTAAACTATAAAATACCTACTCCAAGTATCTGCCCAGTTCCTTGTCCCCTCTCCCCAGGCCCT
TAAAGGAAGTTCTCGATACATATTTGTAGAATAACTGAATGTTTTCAGGATTCCTGTACTTTGCTGAGTTAAAATGG
ATATGGTACCCTTGCTGATTGGTTGAGCCCCTAAGAGGGGGCAGAATATTAAATATTCCATATCAGATATGCTTTTA
CAGGTTTGACTTTAGAAAAGTCTTAGCATGTGAAGCCTGTTGGATAAAGGGCTGTGTTTGCATTTAATCTGTCACTT
TTGTATCTCCTGTCCTGGCTGGCCATTTTGATCTCATGCTGTTCTTTTTTTCTTTTGAACTTGTAGGTCACCATGTA
CTTGACAAGATTTCATTTACTTAAGTGCCATGTTGATGATAATAAAACAATTCGTACTCCCCAATGGTGGATTTATT
ACTATTAAAGAAACCAGGGAAAATATTAATTTTAATATTATAACAACCTGAAAATAATGGAAAAGAGGTTTTTGAAT
TTTTTTTTTTAAATAAACACCTTCTTAAGTGCATGAGATGGTTTGATGGTTTGCTGCATTAAAGGTATTTGGGCAAA
CAAAATTGGAGGGCAAGTGACTGCAGTTTTGAGAATCAGTTTTGACCTTGATGATTTTTTGTTTCCACTGTGGAAAT
AAATGTTTGTAAATAAGTGTAATAAAAATCCCTTTGCATTCTTTCTGGACCTTAAATGGTAGAGGAAAAGGCTCGTG
AGCCATTTGTTTCTTTTGCTGGTTATAGTTGCTAATTCTAAAGCTGCTTCAGACTGCTTCATGAGGAGGTTAATCTA
CAATTAAACAATATTTCCTCTTGGCCGTCCATTATTTTCTGAAGCAGATGGTTCATCATTTCCTGGGCTGTTAAACA
AAGCGAGGTTAAGGTTAGACTCTTGGGAATCAGCTAGTTTTCAATCTTATTAGGGTGCAGAAGGAAAACTAATAAGA
AAACCTCCTAATATCATTTTGTGACTGTAAACAATTATTTATTAGCAAACAATTGATCCCAGAAGGGCAAATTGTTT
GAGTCAGTAATGAGCTGAGAAAAGACAGAGCATATCTGTGTATTTGGAAAAATAATTGTAACGTAATTGCAGTGCAT 
TTAGACAGGCATCTATTTGGACCTGTTTCTATCTCTAAATGAATTTTTGGAAACATTAATGAGGTTTACATATTTCT 
CTGACATTTATATAGTTCTTATGTCCATTTCAGTTGACCAGCCGCTGGTGATTAAAGTTAAAAAGAAAAAAATTATA 
GTGAGAATGAGATTCATTTCAATGTAATGCACTAAAGCAGAACACGAACTTAGCTTGGCCTATTCTAGGTAGTTCCA 
AATAGTATTTTTGTTGTCAAACTTTAAAATTTATATTAATTTGCAAATGTATGTCTCTGAGTAGGACTTGGACCTTT 
CCTGAGATTTATTTTATCCGTGATGTATTTTTTTTAATTCTTTTGATACAGAGAAGGGTCTTTTTTTTTTTTAAGTA 
TTTCAGTGAAAACTTGGTGTAAGTCTGAACCCATCTTTTGAAATGTATTTTCTTCATTGCAGGTCCACCTAATCATC 
CTGTGAAAGTGGTTTCTCTATGGAAAGCTTTGTTTGCTTCCTACAAATACATGCTTATTCCTTAAGGGATGTGTTAG 
AGTTACTGTGGATTTCTCTGTTTTCTGTCTTACAAGAAACTTGTCTATGTACCTTAATACTTTGTTTAGGATGAGGA 
GTCTTTGTGTCCCTGTACAGTAGTCTGACGTATTTCCCCTTCTGTCCCCTAGTAAGCCCAGTTGCTGTATCTGAACA 
GTTTGAGCTCTTTTTGTAATATACTCTAAACCTGTTATTTCTGTGCTAATAAACGAGATGCAGAACCCTTGAAAAAT 
GGA 



<210> SEQ ID NO 31 

<211> Length : 2,546

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 31
>HSENA78_T5

AGTGGGGAGAGATGAGTGTAGATAAAAGGAGTGCAGAAGGCACGAGGAAGCCACAGTGCTCCGGATCCTCCAATCTT 
CGCTCCTCCAATCTCCGCTCCTCCACCCAGTTCAGGAACCCGCGACCGCTCGCAGCGCTCTCTTGACCACTATGAGC 
CTCCTGTCCAGCCGCGCGGCCCGTGTCCCCGGTCCTTCGAGCTCCTTGTGCGCGCTGTTGGTGCTGCTGCTGCTGCT 
GACGCAGCCAGGGCCCATCGCCAGCGCTGGTCCTGCCGCTGCTGTGTTGAGAGAGCTGCGTTGCGTTTGTTTACAGA 
CCACGCAGGGAGTTCATCCCAAAATGATCAGTAATCTGCAAGTGTTCGCCATAGGCCCACAGTGCTCCAAGGTGGAA 
GTGGTGTAAGTTCTGTGCTGCTGTGTCCGCTGTGACCTTGGCZVAGAGAGAAATCCCGCAGCCTGGGTCTTCAACCTT 
GGTATCTCATGAGTGTATCTTCTTTTTCTTTCCTTCAGAGCCTCCCTGAAGAACGGGAAGGAAATTTGTCTTGATCC 
AGAAGCCCCTTTTCTAAAGAAAGTCATCCAGAAAATTTTGGACGGTGGAAACAAGGAAAACTGATTAAGAGAAATGA 
GCACGCATGGAAAAGTTTCCCAGTCTTCAGCAGAGAAGTTTTCTGGAGGTCTCTGAACCCAGGGAAGACAAGAAGGA 
AAGATTTTGTTGTTGTTTGTTTATTTGTTTTTCCAGTAGTTAGCTTTCTTCCTGGATTCCTCACTTTGAAGAGTGTG 
AGGAAAACCTATGTTTGCCGCTTAAGCTTTCAGCTCAGCTAATGAAGTGTTTAGCATAGTACCTCTGCTATTTGCTG 
TTATTTTATCTGCTATGCTATTGAAGTTTTGGCAATTGACTATAGTGTGAGCCAGGAATCACTGGCTGTTAATCTTT 
CAAAGTGTCTTGAATTGTAGGTGACTATTATATTTCCAAGAAATATTCCTTAAGATATXAACTGAGAAGGCTGTGGA 
TTTAATGTGGAAATGATGTTTCATAAGAATTCTGTTGATGGAAATACACTGTTATCTTCACTTTTATAAGAAATAGG
AAATATTTTAATGTTTCTTGGGGAATATGTTAGAGAATTTCCTTACTCTTGATTGTGGGATACTATTTAATTATTTC 
ACTTTAGAAAGCTGAGTGTTTCACACCTTATCTATGTAGAATATATTTCCTTATTCAGAATTTCTAAAAGTTTAAGT 
TCTATGAGGGCTAATATCTTATCTTCCTATAATTTTAGACATTCTTTATCTTTTTAGTATGGCAAACTGCCATCATT 
TACTTTTAAACTTTGATTTTATATGCTATTTATTAAGTATTTTATTAGGAGTACCATAATTCTGGTAGCTAAATATA 
TATTTTAGATAGATGAAGAAGCTAGAAAACAGGCAAATTCCTGACTGCTAGTTTATATAGAAATGTATTCTTTTAGT 
TTTTAAAGTAAAGGCAAACTTAACAATGACTTGTACTCTGAAAGTTTTGGAAACGTATTCAAACAATTTGAATATAA 
ATTTATCATTTAGTTATAAAAATATATAGCGACATCCTCGAGGCCCTAGCATTTCTCCTTGGATAGGGGACCAGAGA 
GAGCTTGGAATGTTAAAAACAAAACAAAACAAAAAAAAACAAGGAGAAGTTGTCCAAGGGATGTCAATTTTTTATCC 
CTCTGTATGGGTTAGATTTTCCAAAATCATAATTTGAAGAAGGCCAGCATTTATGGTAGAATATATAATTATATATA 
AGGTGGCCACGCTGGGGCAAGTTCCCTCCCCACTCACAGCTTTGGCCCCXTTCACAGAGTAGAACCTGGGTTAGAGG 
ATTGCAGAAGACGAGCGGCAGCGGGGAGGGCAGGGAAGATGCCTGTCGGGTTTTTAGCACAGTTCATTTCACTGGGA 
TTTTGAAGCATTTCTGTCTGAATGTAAAGCCTGTTCTAGTCCTGGTGGGACACACTGGGGTTGGGGGTGGGGGAAGA 
TGCGGTAATGAAACCGGTTAGTCAGTGTTGTCTTAATATCCTTGATAATGCTGTAAAGTTTATTTTTACAAATATTT 
CTGTTTAAGCTATTTCACCTTTGTTTGGAAATCCTTCCCTTTTAAAGAGAAAATGTGACACTTGTGAAAAGGCTTGT 
AGGAAAGCTCCTCCCTTTTTTTCTTTAAACCTTTAAATGACAAACCTAGGTAATTAATGGTTGTGAATTTCTATTTT 
TGCTTTGTTTTTAATGAACATTTGTCTTTCAGAATAGGATTCTGTGATAATATTTAAATGGCAAAAACAAAACATAA 
TTTTGTGCAATTAACAAAGCTACTGCAAGAAAAATAAAACATTTCTTGGTAAAAACGTATGTATTTATATATTATAT 
ATTTATATATAATATATATTATATATTTAGCATTGCTGAGCTTTTTAGATGCCTATTGTGTATCTTTTAAAGGTTTT 
GACCATTTTGTTATGAGTAATTACATATATATTACATTCACTATATTAAAATTGTACTTTTTTACTATGTGTCTCAT 
TGGTT 



<210> SEQ ID NO 32 

<211> Length : 1,893

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 32
>HUMODCA_T17

GACGTCGGCCCGCCGGCGCCCCACCAGCTCCGCGCGGGCCCGGGTTGGCCACCGCCGGGCCCCCGCCCCTCCCCCGG 
CGGTGTCCCGGCCGGAACCGATCGTGGCTGGTTTGAGCTGGTGCGTCTCCATGGCGACCCGCCGGTGCTATAAGTAG 
GGAGCGGCGTGCCGTGGGGCTTTGTCAGTCCCTCCTGTAGCCGCCGCCGCCGCCGCCCGCCGCCCCTCTGCCAGCAG 
CTCCGGCGCCACCTCGGGCCGGCGTCTCCGGCGGGCGGGAGCCAGGCGCTGACGGGCGCGGCGGGGGCGGCCGAGCG 
CTCCTGCGGCTGCGACTCAGGCTCCGGCGTCTGCGCTTCCCCATGGGGCTGGCCTGCGGCGCCTGGGCGCTCTGAGA 
TTGTCACTGCTGTTCCAAGGGCACACGCAGAGGGATTTGGAATTCCTGGAGAGTTGCCTTTGTGAGAAGCXGGAAAT 
ATTTCTTTCAATTCCATCTCTTAGTTTTCCATAGGAACATCAAGAAATCATGAACAACTTTGGTAATGAAGAGTTTG
ACTGCCACTTCCTCGATGAAGGTTTTACTGCCAAGGACATTCTGGACCAGAAAATTAATGAAGTTTCTTCTTCTGTT 
GGTTTTGCGGATTGCCACTGATGATTCCAAAGCAGTCTGTCGTCTCAGTGTGAAATTCGGTGCCACGCTCAGAACCA 
GCAGGCTCCTTTTGGAACGGGCGAAAGAGCTAAATATCGATGTTGTTGGTGTCAGCTTCCATGTAGGAAGCGGCTGT 
ACCGATCCTGAGACCTTCGTGCAGGCAATCTCTGATGCCCGCTGTGTTTTTGACATGGGGGCTGAGGTTGGTTTCAG 
CATGTATCTGCTTGATATTGGCGGTGGCTTTCCTGGATCTGAGGATGTGAAACTTAAATTTGAAGAGATCACCGGCG 
TAATCAACCCAGCGTTGGACAAATACTTTCCGTCAGACTCTGGAGTGAGAATCATAGCTGAGCCCGGCAGATACTAT 
GTTGCATCAGCTTTCACGCTTGCAGTTAATATCATTGCCAAGAAAATTGTATTAAAGGAACAGACGGGCTCTGATGA 
CGAAGATGAGTCGAGTGAGCAGACCTTTATGTATTATGTGAATGATGGCGXCTATGGATCATTTAATTGCATACTCT 
ATGACCACGCACATGTAAAGCCCCTTCTGCAAAAGAGACCTAAACCAGATGAGAAGTATTATTCATCCAGCATATGG 
GGACCAACATGTGATGGCCTCGATCGGATTGTTGAGCGCTGTGACCTGCCTGAAATGCATGTGGGTGATTGGATGCT 
CTTTGAAAACATGGGCGCTTACACTGTTGCTGCTGCCTCTACGTTCAATGGCTTCCAGAGGCCGACGATCTACTATG 
TGATGTCAGGGCCTGCGTGGCAACTCATGCAGCAATTCCAGAACCCCGACTTCCCACCCGAAGTAGAGGAACAGGAT 
GCCAGCACCCTGCCTGTGTCTTGTGCCTGGGAGAGTGGGATGAAACGCCACAGAGCAGCCTGTGCTTCGGCTAGTAT 
TAATGTGTAGATAGCACTCTGGTAGCTGTTAACTGCAAGTTTAGCTTGAATTAAGGGATTTGGGGGGACCATGTAAC 
TTAATTACTGCTAGTTTTGAAATGTCTTTGTAAGAGTAGGGTCGCCATGATGCAGCCATATGGAAGACTAGGATATG 
GGTCACACTTATCTGTGTTCCTATGGAAACTATTTGAATATTTGTTTTATATGGATTTTTATTCACTCTTCAGACAC 
GCTACTCAAGAGTGCCCCTCAGCTGCTGAACAAGCATTTGTAGCTTGTACAATGGCAGAATGGGCCAAAAGCTTAGT 
GTTGTGACCTGTTTTTAAAATAAAGTATCTTGAAATAATTAGGCA 

<210> SEQ ID NO 33 

<211> Length : 1,069 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 33 
>R00299_T2 

GCGGCCGCAGAGCACTTTGCCCGGAGCCCAGCGTCCTCCCCTGAGTTCGCTGAGTCTCCCGGGACCAGCAAAGGCTG 
CGCGCCCCGCATCGGCCCGGAGGCGGGGAGCCCTGGGAGGCCTGGCCGAGCTGCCCGCAGGGAAATGGCGGAGAAAG 
CGCTTCTCTGCCCGAGTTCAGCCGGGCTGGGGACTTGGCCCTGGGTCCTGAACTCGGCATGGCCAGTTCTGCCTCTG 
GCTGTGGACCAGGGTGTGGACTGGAGACCGCGGGGGCCAGTCTCATCGGATCAGATCGAGCAGCTCCATCGGAGATT 
TAAGCAGCTGAGTGGAGATCAGCCTACCATTCGCAAGGAGAACTTCAACAATGTCCCGGACCTGGAGCTCAACCCCA 
TCCGATCCAAAATTGTTCGTGCCTTCTTCGACAACAGGAACCTGCGCAAGGGACCCAGTGGCCTGGCTGATGAGATC 
AATTTCGAGGACTTCCTGACCATCATGTCCTACTTCCGGCCCATCGACACCACCATGGACGAGGAACAGGTGGAGCT 
GTCCCGGAAGGAGAAGCTGAGATTTCTGTTCCACATGTACGACTCGGACAGCGACGGCCGCATCACTCTGGAAGAAT 
ATCGAAATGTGGTCGAGGAGCTGCTGTCGGGAAACCCTCACATCGAGAAGGAGTCCGCTCGCTCCATCGCCGACGGG
GCCATGATGGAGGCGGCCAGCGTGTGCATGGGGCAGATGGAGCCTGATCAGGTGTACGAGGGGATCACCTTCGAGGA 
CTTCCTGAAGATCTGGCAGGGGATCGACATTGAGACCAAGATGCACGTCCGCTTCCTTAACATGGAAACCATGGCCC 
TCTGCCACTGACCCACCGCCACCTCCGCGGAGAAACTGCACTTTGCAATGGGGCCGCCTCCCCGCGTAGCTGGAGCA 
GCCCAGGCCCGGCGGACAGCCTCTTCCTGCAGCGCCGGTACATAGCCAAGGCTCGTCTGCGCACCTTGTGTCTTGTA 
GGGTATGGTATGTGGGACTTCGCTGTTTTTATCTCCAATAAAAAAAAAAAAAAGGTTTGTTAATTAAT 

<210> SEQ ID NO 34 

<211> Length : 1,250 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 34 
>W60282_PEA_1_T11

GGAGCGGCCTAGGGGAGGCCAGGGGCCCACCTGGGCTGGGGCTGTGGAGAGGGAGTGGCTGGGACGGGAGGAAAAAG 
AGAGACGGAGATTAGATGGAAGAAGAGGGATTTCAAGACAAATTGCCAGAGATGCAGTCAGAGAGACTGACTGAGAG 
ACACAAAGATAGAAGGAATTAGAGAAAGGGCCACACAGAGCCAGACAGAGAGAGAAGAGTGGAGATGGAGACAGGGA 
CGAGGACAGAGAAAGGCAGACAGACACATAGGGACAGAAAGAGAAAAATCACACAAAGTCAGAATTACTGAATGACA 
GGGAATGACACATAGAACGAGACACAGATTCAGAGACTCAGGGCAGGGAAAGGAAGGCTGCAGACAGACAGACAGAC 
AGAGGGAGGCTGAGACACAGGGAGAAGAGGGGCTTGGAGAGGTGGCACAGGCAGGCAGCCAGTGCCTCAGAGGCCTC 
CGGGGAGGGCCCTCACACACACCCCGCCCCGGGGCATTAAGGCAGGGCTTGGAGGCCAGTCATCCTGGGCCCGCCCA 
GGGCCGCCCCCCTGCCAGCCCGCCTGCCTGGTGCCTGGCACCTGGCGCTCCAACCCAGCCTACCTGCTGTAGCTGCC 
GCCACTGCCGTCTCCGCCGCCACTGGGCCCCCAGAGCCCCAGCCCCAGAGCCTAGGAACCTGGGGCCCGCTCCTCCC 
CCCTCCAGGCCATGAGGATTCTGCAGTTAATCCTGCTTGCTCTGGCAACAGGGCTTGTAGGGGGAGAGACCAGGATC 
ATCAAGGGGTTCGAGTGCAAGCCTCACTCCCAGCCCTGGCAGGCAGCCCTGTTCGAGAAGACGCGGCTACTCTGTGG 
GGCGACGCTCATCGCCCCCAGATGGCTCCTGACAGCAGCCCACTGCCTCAAGCCTACGCCTGCCTCACACCTTGCGA 
TGCGCCAACATCACCATCATTGAGCACCAGAAGTGTGAGAACGCCTACCCCGGCAACATCACAGACACCATGGTGTG 
TGCCAGCGTGCAGGAAGGGGGCAAGGACTCCTGCCAGGGTGACTCCGGGGGCCCTCTGGTCTGTAACCAGTCTCTTC 
AAGGCATTATCTCCTGGGGCCAGGATCCGTGTGCGATCACCCGAAAGCCTGGTGTCTACACGAAAGTCTGCAAATAT 
GTGGACTGGATCCAGGAGACGATGAAGAACAATTAGACTGGACCCACCCACCACAGCCCATCACCCTCCATTTCCAC 
TTGGTGTTTGGTTCCTGTTCACTCTGTTAATAAGAAACCCTAAGCCAAGACCCTCTACGAACATTCTTTGGGCCTCC 
TGGACTACAGGAGATGCTGTCACTTAATAATCAACCTGGGGTTCGAAATCAGTGAGACCTGGATTCAAATTCTGCCT 
TGAAATATTGTGACTCTGGGAATGACAACACCTGGTTTGTTCTCTGTTGTATCCCCAGCCCCAAAGACAGCTCCTGG 
CCATATATCAAGGTTTCAATAAATATTTGCTAAATGAGTGAATC

<210> SEQ ID NO 35

<211> Length : 4,901

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 35
>Z41644_PEA_1_T5

CCTTGCTGTTCATGGCCCAGCAGGGGCCCTATGGGGGTCAGGCCTGCAGGCACTCACACTTGGCACCTGCTCCAAAA 
CCCTTTCAGGTCTTTGAGGATCTGAGCCCTGGGCCTGGGTCTCCCGCCGGCTCGGAAAAGCTGGCCTGCCGGGCCAG 
ACGAGAGAACCACACGATTCAGAAAAGCAGTGCCCTTCAGCAGCCTCTCCACCGTCTGGGCTCCCCAAAGGCAGAGC 
GGGACGCTGGAAATGTGTGCGCGCTGTGGTATGGGTGTGCAAGTGTGCGAAGGCGGCGTGTTGTGTGAGCGAGAGGG 
TAGCGGATGTGTGTGTGCGTGTGCGCGCGTGGCTCCGGGTGTGCGCCGCTGCGATAGCGGGTCCTTTCCCGGGGCGG 
GCGACGGGCGGGCTGGGAAGGTCTCCTCCCCTCACCACATTGAGAAATCTCAGTGAGTCACCGAGTGGTTCTGCATA 
TTAATGAGCTCGCTCGCTGCGAGGGCAGGAGCGGATTTAAAAGAGGCCAGGGCGGGCGGAGGGAGGCTGTGGAGAGA 
GCGCGGAGACAAGCGCAGAGCGCAGCGCACGGCCACAGACAGCCCTGGGCATCCACCGACGGCGCAGCCGGAGCCAG 
CAGAGCCGGAAGGCGCGCCCCGGGCAGAGAAAGCCGAGCAGAGCTGGGTGGCGTCTCCGGGCCGCCGCTCCGACGGG 
CCAGCGCCCTCCCCATGTCCCTGCTCCCACGCCGCGCCCCTCCGGTCAGCATGAGGCTCCTGGCGGCCGCGCTGCTC 
CTGCTGCTGCTGGCGCTGTACACCGCGCGTGTGGACGGGTCCAAATGCAAGTGCTCCCGGAAGGGACCCAAGATCCG 
CTACAGCGACGTGAAGAAGCTGGAAATGAAGCCAAAGTACCCGCACTGCGAGGAGAAGATGGTTATCATCACCACCA 
AGAGCGTGTCCAGGTACCGAGGTCAGGAGCACTGCCTGCACCCCAAGCTGCAGAGCACCAAGCGCTTCATCAAGTGG 
TACAACGCCTGGAACGAGAAGCGCAGGTACGCCCCGCCTCTACTCACCTTCCTTCCCACCAGACCCAGCTGTGGCTC 
TCAGGATGGGAAGGGACCCCCCCACCAGGTCATCTAGCCCCATCTAATATGTGAACACCCACCACAACATCCACAGC 
AAACAGATACTCAGACAATGCTTACATACCCCCAGGGACAAGGAACTCACCACTTACCAAAGCTAGCTATCCATCTC 
TTGTCCATTTGCAAGCATGGCAGGTTTGTCATTTTGTAAACTAAAGTCTGTCTCACTCTAATATTTGCATTATAATC 
TTAATTCCTCTTTTTATTTCAGTTACGTAAGTTGTTAAATGGCAGAGTGAGCACTGGCATGGCTGCCAGGGGAGCTC 
TGAGGACTTCAGTGGGGTGAAATGTGACCACTTAGGTGACTGTGTATGTTGGCTATAAAACTGCGCTATAAAACCAT 
GAGGTGCTGAGGATGATCCTTGCCAGAAACATGTTTTCTTCTCCAAGGTGCCCCACTCCCTCTGCTGCCCAGAAACC 
TGATAAACTCCTTCCTTCGCAGGTGCTGGAAGGCACCACAGGTTTGGCTCTTTAAAATCAGAGCCACTGTTAACCAA 
GGCGGGCAGCAGTGTTAAGACCACCAGCACCCTGAACCAGCCCTGTACTTACTGGGCACTGTTTCCTTAAAATCAGA 
AGGTGGCTTCCCATCTCTGGTTTCCTGGGGTCTTATGTCTGTCCTCGGAGGGAGAATCCAGTTCCTAGCTCCCCTGT 
ACCATGCGAAGGTAGCCTGTCCTGTCTCACTCCTCAGATACGCAGAGTCTGTTTACACATTTGCCTGCATAGCATGA 
TCAGGAAGCACACACACACACACACACACACACACACACGCATGCATGCACACACCATGCAGGTGACTTCCCCAGGA 
ACTAGTGCCAGCACCCCTGCTGCAGAGGGGGATATCAAGGCTAAATGGAAGAGAGGGGTGACTTGCCTGGGAGCACA 
GGGCAAAGCCAGGACAGCAAACCAGGCCTCCTGGTGCTACCCCACCAGCTGCCCTCACAGGGTGGAAGGTACAGCCA 



TAGTGGGTGCCTGCATTGCCCTCCCCTCACCTGGCCCAGCCATGCTACCCCAAGCTCAGCCCTGTGACCAGCTCTCC 
CAGAGCTGACACTCGGGCTCAACCCCTATACCTGAGCCTTTTTTGCTGCCTCCAAAACAGCCTCATCTGCAGTTGCT 
TGAAATAGAAAGTGATGAGAGCAATAAATTATTTTCTATAAATCTGCTGGGAATGAAGCCCTCTTTCTGGTCAAGCC 
AGGCAGCTCATGTGGCAAAGGCCAGAACTGCGCAGTCCACACTCTGTCACCCTCCAGGCCCTGTGTCATCAAAACTG 
GCAGAAGGCTAATCCCATGGGCAGGTTATGGAAGAGGCTGAGGGCATCTTGATCTGATTGCTGGGGGATACTCAAAC 
CTTTAGCTCACCTTGCTTCTCCCCTCCACCTGAGCTGCAGCCTGGAAAAGGAGGCACCCACAGGTCTAAACATGGCC 
CTGCTTTTTTTTTTCCTGAAAATTCCAATAACAAAAGCAACGAGAGCCTCTCACTACCAGGCCTTCTCTCACTTTGC 
TATAAAATTAGTTCACCCCTCTTTCTTAGAGTGTTGAGGTCCCTGCCCTCCCCACCTCCCTCCCCTGAAACAAGTTG 
AAAATATCTTAATGAACATAGAACAGTGATAAAGGAAGTGTTTGAAGTCCTCTTTGTACAGAGAGAGAGAGAAAGAG 
AATGCCAAAGCTAGGTTGGAGGAAGTAGAAGGGTATACGGTGGGCTCAGGCCCATGGGGGCCACACAGAGGAGCTCT 
GTGCACTTCAGAGACCAGAGCTTCCAGGGAGCTTCTGGCCACCACAGGAAGCAGCCTAGTCAGGCATTTTATTTCAA 
TGGATAATTCAGTGGTCTTACTCAGAAATCAAGAACGAGACAGAAAAGTGATAGGCTAAGTGTAACGTATGGCCCCA 
GGGCAGCCATGGGGCAGAACTAGAAGAAAGCAAAATATCTAACTGGGCACAGCTTGAGAGGTGAGGGGAAGGTGGGG 
CTGGGAACGAGTAGAGATGAGGCAATGCAGCCAGGAGCAGGGACTGAGGGGCACAGGCCTCCTGCACCACTGCCCCA 
CCCCACCAACCACCTCTTCTGTCTCCAGGAAGCAGCTTCTAGAGCTAGCATTCTTCTGGAGGACATGCATTATTTGG 
GCAAAATACAAAGAAATATACAAGCCTAAGTCAAGTAAGGGAATGCCTCCCACCCTTGCTATTTTCTCTAAATAGAG 
AGGCTGAGTACAGACGCGGAAAGAAACAAGGAGGTGTGGGAGCAGCCCGCCATGCTAGAGAAAGACTACATTCCTGC 
CACTAACAGTCGGTGGCCACTGGGCAAATCTTAAGTCTGTGGTGCCTCAGTTTCCTCATATGCAAAGCGGGTTTGTT 
CCATAGGCCTCTGAGGACAAAATGAGATTGCAGAAGTGAGATTGCAGATGGTTAGAAAAGACAAAGCCACACTGGTG 
TGAGTTTTCATGGTCCCCGGGACCACATCCTCAGAAGGATCCCTCCCACTTCTCCTGGGGGTTCCTGCAGTTCTGGG 
ACAGGGGCATTCCCTGCAGACCAGACGTGAATGAAGCCGCTTAGCCAGCATCTTGTGAACGGCCTGCCTCATGTCCT 
GAGCCACTTACACATGTGTTTTTTCTCCCCAGGGTCTACGAAGAATAGGGTGAAAAACCTCAGAAGGGAAAACTCCA 
AACCAGTTGGGAGACTTGTGCAAAGGACTTTGCAGATTAAAAAAAAAAAAAAAAAAAAGCCTTTCTTTCTCACAGGC 
ATAAGACACAAATTATATATTGTTATGAAGCACTTTTTACCAACGGTCAGTTTTTACATTTTATAGCTGCGTGCGAA 
AGGCTTCCAGATGGGAGACCCATCTCTCTTGTGCTCCAGACTTCATCACAGGCTGCTTTTTATCAAAAAGGGGAAAA 
CTCATGCCTTTCCTTTTTAAAAAATGCTTTTTTGTATTTGTCCATACGTCACTATACATCTGAGCTTTATAAGCGCC 
CGGGAGGAACAATGAGCTTGGTGGACACATTTCATTGCAGTGTTGCTCCATTCCTAGCTTGGGAAGCTTCCGCTTAG 
AGGTCCTGGCGCCTCGGCACAGCTGCCACGGGCTCTCCTGGGCTTATGGCCGGTCACAGCCTCAGTGTGACTCCACA 
GTGGCCCCTGTAGCCGGGCAAGCAGGAGCAGGTCTCTCTGCATCTGTTCTCTGAGGAACTCAAGTTTGGTTGCCAGA 
AAAATGTGCTTCATTCCCCCCTGGTTAATTTTTACACACCCTAGGAAACATTTCCAAGATCCTGTGATGGCGAGACA 
AATGATCCTTAAAGAAGGTGTGGGGTCTTTCCCAACCTGAGGATTTCTGAAAGGTTCACAGGTTCAATATTTAATGC 
TTCAGAAGCATGTGAGGTTCCCAACACTGTCAGCAAAAACCTTAGGAGAAAACTTAAAAATATATGAATACATGCGC 
AATACACAGCTACAGACACACATTCTGTTGACAAGGGAAAACCTTCAAAGCATGTTTCTTTCCCTCACCACAACAGA 
ACATGCAGTACTAAAGCAATATATTTGTGATTCCCCATGTAATTCTTCAATGTTAAACAGTGCAGTCCTCTTTCGAA 
AGCTAAGATGACCATGCGCCCTTTCCTCTGTACATATACCCTTAAGAACGCCCCCTCCACACACTGCCCCCCAGTAT 
ATGCCGCATTGTACTGCTGTGTTATATGCTATGTACATGTCAGAAACCATTAGCATTGCATGCAGGTTTCATATTCT 
TTCTAAGATGGAAAGTAATAAAATATATTTGAAATGTACCAAAATTCTAGA

<210> SEQ ID NO 36

<211> Length : 3,429

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 36
>Z44808_PEA_1_T11

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACGTTTTTGAGAGTGGATCAAGATAAA 
GACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGCATCTGACGGAAGGACCTTCCTTTC 
CCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCATATCGAGGAAACTGCAAAGACGTGT 
CCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCTGAGTGC 
AATGACGACGGCACCTACAGTCAGGTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAG 
GCCCATCAGCGGCACTGCCGTGGCCCACAAGACGCCCCGGTGCCCGGGTTCCGTAAATGAAAAGTTACCCCAACGCG 
AAGGCACAGGAAAAACAGATATTGCATCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAA 
ACCAATAAGAATTCAGTGTCATCCTGTGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGA 
CAATGTGGTGATCCCTGAGTGTGCGCACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCT 
GGTGCGTCCTGGTGGACACGGGGCGCCCCATTCCCGGCACATCCACAAGGTACGAGCAGCCGAAATGTGACAACACG 
GCCAGGGCCCACCCAGCCAAAGCCCGGGACCTGTACAAGGGCCGCCAGCTACAAGGTTGTCCGGGTGCCAAAAAGCA 
TGAGTTTCTGACCAGCGTTCTGGACGCGCTGTCCACGGACATGGTCCACGCCGCCTCCGACCCCTCCTCCTCGTCAG 
GCAGGCTCTCAGAACCCGACCCCAGCCATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAA 
AACTCCAGTGGAGACATCGGCAAAAAGGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAA 
ATGTGTGAAGAAGTTTGTTGAATACTGTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCC 
TGGGCGTGGCGAAAGAGGACGGCAAAGCGGACACCAAGAAACGCCACACCCCCAGAGGTCATGCTGAAAGTACGTCT 
AATAGACAGCCAAGGAAACAAGGATAAATGGCTCATACCCCGAAGGCAGTTCCTAGACACATGGGAAATTTCCCTCA 
CCAAAGAGCAATTAAGAAAACAAAAACAGAAACACATAGTATTTGCACTTTGTACTTTAAATGTAAATTCACTTTGT 
AGAAATGAGCTATTTAAACAGACTGTTTTAATCTGTGAAAATGGAGAGCTGGCTTCAGAAAATTAATCACATACAAT 
GTATGTGTCCTCTTTTGACCTTGGAAATCTGTATGTGGTGGAGAAGTATTTGAATGCATTTAGGCTTAATTTCTTCG 
CCTTCCATATGTTAACAGTAGAGCTCTATGCACTCCGGCTGCAATCGTATGGCTTTCTCTAACCCCTGCAGTCACTT
CCAGATGCCTGTGCTTACAGCATTGTGGAATCATGTTGGAAGCTCCACATGTCCATGGAAGTTTGTGATGTACGGCC 
GACCCTACAGGCAGTTAACATGCATGGGCTGGTTTGTTTCTTGGGATTTTCTGTTAGTTTGTCTTGTTTTGCTTTCC 
AGAGATCTTGCTCATACAATGAATCACGCAACCACTAAAGCTATCCAGTTAAGTGCAGGTAGTTCCCCTGGAGGAAA 
TAATATTTTCAAACTGTCGTTGGTGTGATACTTTGGCTCAAAGGATCTTTGCTTTTCCATTTTAAGCTTCTGTTTTG 
AGTTTTGCCCTGGGGCTTGAATGAGTCCCAGAGAGTCGTTCGGATGGTGGGAGGCTGCCTAGGAGGCAGTAAATCCA 
GTCACAGTGCCTGGGAGGGGCCCATCCTTCCAAAATGTAAATCCAGTCGCGGTGTGACCGAGCTGGCTAACAGGCTT 
GTCTGCCTGGTTTTCCTCCTACACGTGGACATTATTCTCCTGATCCTCCTACCTGGTCCACCCCAGGGCTACCGGAA 
GGTAAAATCTTCACCTGAACCAATTATGAGCAGTCTCCTTACTGAAGGTACAGCCGGATACGTGGTGCCCCCGGGGC 
TGGTGTTGGCAGCCGGGGGGAGGTGCCTGAGGGTCCCCACGGTTCCTTTCTGCTTTTCTGAATGCATCAAGGGTACG 
AGAACTTGCCAATGGGAAATTCATCCGAGTGGCACTGGCAGAGAAGGATAGGAGTGGAATGCCCACACAGTGACCAA 
CAGAACTGGTCTGCGTGCATAACCAGCTGCCACCCTCAGGCCTGGGCCCCAGAGCTCAGGGCACCCAGTGTCTTAAG 
GAACCATTTGGAGGACAGTCTGAGAGCAGGAACTTCAAGCTGTGATTCTATCTCGGCTCAGACTTTTGGTTGGAAAA 
AGATCTTCATGGCCCCAAATCCCCTGAGACATGCCTTGTAGAATGATTTTGTGATGTTGTGATGCTTGTGGAGCATC 
GCGTAAGGCTTCTTGCTTATTTAAACTGTGCAAGGTAAAAATCAAGCCTTTGGAGCCACAGAACCAGCTCAAGTACA 
TGCCAATGTTGTTTAAGAAACAGTTATGATCCTAAACTTTTTGGATAATCTTTTATATTTCTGACCTTTGAATTTAA 
TCATTGTTCTTAGATTAAAATAAAATATGCTATTGAAACTA 

<210> SEQ ID NO 37 

<211> Length : 5,165 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 37
>Z44808_PEA_1_T4

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACGTTTTTGAGAGTGGATCAAGATAAA 
GACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGCATCTGACGGAAGGACCTTCCTTTC 
CCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCATATCGAGGAAACTGCAAAGACGTGT 
CCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCTGAGTGC
AATGACGACGGCACCTACAGTCAGGTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAG 
GCCCATCAGCGGCACTGCCGTGGCCCACAAGACGCCCCGGTGCCCGGGTTCCGTAAATGAAAAGTTACCCCAACGCG 
AAGGCACAGGAAAAACAGATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAGATATTGCA 
TCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTG 
TGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCTGAGTGTGCGC 
ACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGACACGGGGCGC 
CCCATTCCCGGCACATCCACAAGGTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCG 
GGACCTGTACAAGGGCCGCCAGCTACAAGGTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACG 
CGCTGTCCACGGACATGGTCCACGCCGCCTCCGACCCCTCCTCCTCGTCAGGCAGGCTCTCAGAACCCGACCCCAGC 
CATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACTCCAGTGGAGACATCGGCAAAAA 
GGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGTGTGAAGAAGTTTGTTGAATACT 
GTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGGCGTGGCGAAAGAGGACGGCAAA 
GCGGACACCAAGAAACGCCACACCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACAGGATGCAATGGTGGTGTC 
CTCCAGACCCAAAGCCACAACCCATCGCAAGTCAAGAACACTTTCCAGAAGATAAACATGAGTGGGTTCATGTCTCT 
CTCCTTCAAAGCCAGGACAAAATCCCCACTTCTTTGCTGCCGCGAGTCAATTTGTGATTTATTTTGTCTGCACCTGT 
TTGATGCCAGGTCGACATTTCCTAAGGCAAGCCCCTGTATTTGTTGTGGATTTAAGTGGAGGCGGCCAGCACACACC 
TTGGATGTAATTTAAAACCATTTCCTGAGGAAAGATGTGTGATATGCTTTCCTTTGTTTAGCAAATGTTTATGGTTT 
TAACTTTAAATCTCACCGCAAATCACTTACACTTGAAAACAGGGCTGGTCTGAAAGTAATTACCCTCCCTGAGTGCC 
AAGACCTCCAGAAGTTGTTTTCATTCCCGAATGGCAATCACTGTACTCATGCGCTCCACGCATCTTAAATAAACTCA 
GTTCAAAGCACATGCCTCCTGCTTCAGCTCTTTTTTCCAAAAAGAGAAACAGAAGCAGGTTCCCCCTCCTTTTATAG 
TGCCTGCGTGGACACGCGGACCTCCATGCCTTTCATGCTGTGGCTATGTCAGCAAACTACGATATTGGGATGATCCT 
AACGGGCAAGCCAGCTGCGGCTCCTACCGGCCGTGGCCATTGAAGGCCACCATGTTGCTTTGAAACATCTCAAAGAA 
TAACATAGTGCCAGCCAGCAAGGGTTTCACCATATGCATGACCCAGACAGGAACTATCAAAAGAAGGGATCACGGGA 
AGGTGCATGATGCTAATGTGGAATCCAGAGGAGCTCTTTCCTGATCTCTTCAGCTTCCGCTGCCACTCCAGAATCAT 
CAGAGCTGATATTAAATAAGTTAAAATGTTAGTCCACCGTCTCCTCCTGCAATCCTAACCATCTTTTGAGACTGTTA 
GAATACTTTGACGGGTTGTCTTTCTGTGCAACTAATTTAAACCTCAAGTTTAGTGTAGGAGATGGGTTTGTCTTCTC 
ACCTCTTCAGATCTTTATCAAGGGGGAATAAAAGCCAACCCAGAAACCTAAACTTTAAAATTTAATTATTTGAAATA 
ATAAAACAGAAGAAGGGATCAACATTTGTCGGAATTGGCACTCTTGGAAAACTAAGTCTAGGAGATCATATATTGCT 
TTTTTTTTTTCATTCTAAATTACTTTTAATTGAAAGTCAAGATGCTGAGTTACAGTTGTTTATCATTATAATAAGCA 
AACTTTTTAAGTTGGATTTCTTCTTAAAGAGGTAAACTAGTGAACAAAAAAGATAAAAAGGAAAATTAAGAATCAAC 
TATGCCTTTATCAAATTTGAAGCATAAGTTATATTATTAAAATTATTTTTGTATAATCAAGGTGATAAGACATTCTG 
GAAAACATTTAATGTATTTAGTACTTAGAATATTTACAGTGGATGTTACTTTTTTGAAACGATATATTTTTCCCAAT 
TTTTCTATCATGTCAAGGAAGGAAACTGTTAAGAAGTTACCAGTGTCCAAAATGTCTTCATTGTTTCTTACTCATAC 
TTACACCTCACATGACCTGCCCAGCCCTCTTTGGTTCAGTTCATTCCCAGAAGCCAAGCCTTAGTCTTCACAGATGA 
GCGACACACACCTCTGAATATAATGTCTCTTTTTTGTTTTTTCCTTTTCAGCCAAGGAAACAAGGATAAATGGCTCA 
TACCCCGAAGGCAGTTCCTAGACACATGGGAAATTTCCCTCACCAAAGAGCAATTAAGAAAACAAAAACAGAAACAC 
ATAGTATTTGCACTTTGTACTTTAAATGTAAATTCACTTTGTAGAAATGAGCTATTTAAACAGACTGTTTTAATCTG 
TGAAAATGGAGAGCTGGCTTCAGAAAATTAATCACATACAATGTATGTGTCCTCTTTTGACCTTGGAAATCTGTATG
TGGTGGAGAAGTATTTGAATGCATTTAGGCTTAATTTCTTCGCCTTCCATATGTTAACAGTAGAGCTCTATGCACTC 
CGGCTGCAATCGTATGGCTTTCTCTAACCCCTGCAGTCACTTCCAGATGCCTGTGCTTACAGCATTGTGGAATCATG 
TTGGAAGCTCCACATGTCCATGGAAGTTTGTGATGTACGGCCGACCCTACAGGCAGTTAACATGCATGGGCTGGTTT 
GTTTCTTGGGATTTTCTGTTAGTTTGTCTTGTTTTGCTTTCCAGAGATCTTGCTCATACAATGAATCACGCAACCAC 
TAAAGCTATCCAGTTAAGTGCAGGTAGTTCCCCTGGAGGAAATAATATTTTCAAACTGTCGTTGGTGTGATACTTTG 
GCTCAAAGGATCTTTGCTTTTCCATTTTAAGCTTCTGTTTTGAGTTTTGCCCTGGGGCTTGAATGAGTCCCAGAGAG 
TCGTTCGGATGGTGGGAGGCTGCCTAGGAGGCAGTAAATCCAGTCACAGTGCCTGGGAGGGGCCCATCCTTCCAAAA 
TGTAAATCCAGTCGCGGTGTGACCGAGCTGGCTAACAGGCTTGTCTGCCTGGTTTTCCTCCTACACGTGGACATTAT 
TCTCCTGATCCTCCTACCTGGTCCACCCCAGGGCTACCGGAAGGTAAAATCTTCACCTGAACCAATTATGAGCAGTC 
TCCTTACTGAAGGTACAGCCGGATACGTGGTGCCCCCGGGGCTGGTGTTGGCAGCCGGGGGGAGGTGCCTGAGGGTC 
CCCACGGTTCCTTTCTGCTTTTCTGAATGCATCAAGGGTACGAGAACTTGCCAATGGGAAATTCATCCGAGTGGCAC 
TGGCAGAGAAGGATAGGAGTGGAATGCCCACACAGTGACCAACAGAACTGGTCTGCGTGCATAACCAGCTGCCACCC 
TCAGGCCTGGGCCCCAGAGCTCAGGGCACCCAGTGTCTTAAGGAACCATTTGGAGGACAGTCTGAGAGCAGGAACTT 
CAAGCTGTGATTCTATCTCGGCTCAGACTTTTGGTTGGAAAAAGATCTTCATGGCCCCAAATCCCCTGAGACATGCC 
TTGTAGAATGATTTTGTGATGTTGTGATGCTTGTGGAGCATCGCGTAAGGCTTCTTGCTTATTTAAACTGTGCAAGG 
TAAAAATCAAGCCTTTGGAGCCACAGAACCAGCTCAAGTACATGCCAATGTTGTTTAAGAAACAGTTATGATCCTAA 
ACTTTTTGGATAATCTTTTATATTTCTGACCTTTGAATTTAATCATTGTTCTTAGATTAAAATAAAATATGCTATTG 
AAACTA 

<210> SEQ ID NO 38 

<211> Length : 3,575 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 38
>Z44808_PEA_1_T5

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACGTTTTTGAGAGTGGATCAAGATAAA 
GACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGCATCTGACGGAAGGACCTTCCTTTC
CCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCATATCGAGGAAACTGCAAAGACGTGT 
CCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCTGAGTGC 
AATGACGACGGCACCTACAGTCAGGTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAG 
GCCCATCAGCGGCACTGCCGTGGCCCACAAGACGCCCCGGTGCCCGGGTTCCGTAAATGAAAAGTTACCCCAACGCG 
AAGGCACAGGAAAAACAGATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAGATATTGCA 
TCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTG 
TGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCTGAGTGTGCGC 
ACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGACACGGGGCGC 
CCCATTCCCGGCACATCCACAAGGTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCG 
GGACCTGTACAAGGGCCGCCAGCTACAAGGTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACG 
CGCTGTCCACGGACATGGTCCACGCCGCCTCCGACCCCTCCTCCTCGTCAGGCAGGCTCTCAGAACCCGACCCCAGC 
CATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACTCCAGTGGAGACATCGGCAAAAA 
GGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGTGTGAAGAAGTTTGTTGAATACT 
GTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGGCGTGGCGAAAGAGGACGGCAAA 
GCGGACACCAAGAAACGCCACAGAAGTAAGAGAAACCTGTGATGGCCAGAGCCCAGATGTTCTTAGGAGGCAAGCCA 
GGAGAAGCCGGGTCTGACTTTTCAGCTCAGAGACAGCACTCCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACA 
GCCAAGGAAACAAGGATAAATGGCTCATACCCCGAAGGCAGTTCCTAGACACATGGGAAATTTCCCTCACCAAAGAG 
CAATTAAGAAAACAAAAACAGAAACACATAGTATTTGCACTTTGTACTTTAAATGTAAATTCACTTTGTAGAAATGA 
GCTATTTAAACAGACTGTTTTAATCTGTGAAAATGGAGAGCTGGCTTCAGAAAATTAATCACATACAATGTATGTGT 
CCTCTTTTGACCTTGGAAATCTGTATGTGGTGGAGAAGTATTTGAATGCATTTAGGCTTAATTTCTTCGCCTTCCAT 
ATGTTAACAGTAGAGCTCTATGCACTCCGGCTGCAATCGTATGGCTTTCTCTAACCCCTGCAGTCACTTCCAGATGC 
CTGTGCTTACAGCATTGTGGAATCATGTTGGAAGCTCCACATGTCCATGGAAGTTTGTGATGTACGGCCGACCCTAC 
AGGCAGTTAACATGCATGGGCTGGTTTGTTTCTTGGGATTTTCTGTTAGTTTGTCTTGTTTTGCTTTCCAGAGATCT 
TGCTCATACAATGAATCACGCAACCACTAAAGCTATCCAGTTAAGTGCAGGTAGTTCCCCTGGAGGAAATAATATTT 
TCAAACTGTCGTTGGTGTGATACTTTGGCTCAAAGGATCTTTGCTTTTCCATTTTAAGCTTCTGTTTTGAGTTTTGC 
CCTGGGGCTTGAATGAGTCCCAGAGAGTCGTTCGGATGGTGGGAGGCTGCCTAGGAGGCAGTAAATCCAGTCACAGT 
GCCTGGGAGGGGCCCATCCTTCCAAAATGTAAATCCAGTCGCGGTGTGACCGAGCTGGCTAACAGGCTTGTCTGCCT 
GGTTTTCCTCCTACACGTGGACATTATTCTCCTGATCCTCCTACCTGGTCCACCCCAGGGCTACCGGAAGGTAAAAT 
CTTCACCTGAACCAATTATGAGCAGTCTCCTTACTGAAGGTACAGCCGGATACGTGGTGCCCCCGGGGCTGGTGTTG 
GCAGCCGGGGGGAGGTGCCTGAGGGTCCCCACGGTTCCTTTCTGCTTTTCTGAATGCATCAAGGGTACGAGAACTTG 
CCAATGGGAAATTCATCCGAGTGGCACTGGCAGAGAAGGATAGGAGTGGAATGCCCACACAGTGACCAACAGAACTG 
GTCTGCGTGCATAACCAGCTGCCACCCTCAGGCCTGGGCCCCAGAGCTCAGGGCACCCAGTGTCTTAAGGAACCATT 
TGGAGGACAGTCTGAGAGCAGGAACTTCAAGCTGTGATTCTATCTCGGCTCAGACTTTTGGTTGGAAAAAGATCTTC 
ATGGCCCCAAATCCCCTGAGACATGCCTTGTAGAATGATTTTGTGATGTTGTGATGCTTGTGGAGCATCGCGTAAGG 
CTTCTTGCTTATTTAAACTGTGCAAGGTAAAAATCAAGCCTTTGGAGCCACAGAACCAGCTCAAGTACATGCCAATG 
TTGTTTAAGAAACAGTTATGATCCTAAACTTTTTGGATAATCTTTTATATTTCTGACCTTTGAATTTAATCATTGTT 
CTTAGATTAAAATAAAATATGCTATTGAAACTA

<210> SEQ ID NO 39

<211> Length : 2,397

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 39
>Z44808_PEA_1_T8

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACGTTTTTGAGAGTGGATCAAGATAAA 
GACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGCATCTGACGGAAGGACCTTCCTTTC 
CCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCATATCGAGGAAACTGCAAAGACGTGT 
CCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCTGAGTGC 
AATGACGACGGCACCTACAGTCAGGTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAG 
GCCCATCAGCGGCACTGCCGTGGCCCACAAGACGCCCCGGTGCCCGGGTTCCGTAAATGAAAAGTTACCCCAACGCG 
AAGGCACAGGAAAAACAGATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAGATATTGCA 
TCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTG 
TGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCTGAGTGTGCGC 
ACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGACACGGGGCGC 
CCCATTCCCGGCACATCCACAAGGTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCG 
GGACCTGTACAAGGGCCGCCAGCTACAAGGTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACG 
CGCTGTCCACGGACATGGTCCACGCCGCCTCCGACCCCTCCTCCTCGTCAGGCAGGCTCTCAGAACCCGACCCCAGC 
CATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACTCCAGTGGAGACATCGGCAAAAA 
GGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGTGTGAAGAAGTTTGTTGAATACT 
GTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGGCGTGGCGAAAGAGGACGGCAAA 
GCGGACACCAAGAAACGCCACACCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACAGGATGCAATGGTGGTGTC 
CTCCAGACCCAAAGCCACAACCCATCGCAAGTCAAGAACACTTTCCAGAAGATAAACATGAGTGGGTTCATGTCTCT 
CTCCTTCAAAGCCAGGACAAAATCCCCACTTCTTTGCTGCCGCGAGTCAATTTGTGATTTATTTTGTCTGCACCTGT 
TTGATGCCAGGTCGACATTTCCTAAGGCAAGCCCCTGTATTTGTTGTGGATTTAAGTGGAGGCGGCCAGCACACACC 
TTGGATGTAATTTAAAACCATTTCCTGAGGAAAGATGTGTGATATGCTTTCCTTTGTTTAGCAAATGTTTATGGTTT
TAACTTTAAATCTCACCGCAAATCACTTACACTTGAAAACAGGGCTGGTCTGAAAGTAATTACCCTCCCTGAGTGCC 
AAGACCTCCAGAAGTTGTTTTCATTCCCGAATGGCAATCACTGTACTCATGCGCTCCACGCATCTTAAATAAACTCA 
GTTCAAAGCA 

<210> SEQ ID NO 40 

<211> Length : 2,206 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 40
>Z44808_PEA_1_T9

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACGTTTTTGAGAGTGGATCAAGATAAA 
GACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGCATCTGACGGAAGGACCTTCCTTTC 
CCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCATATCGAGGAAACTGCAAAGACGTGT 
CCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCTGAGTGC 
AATGACGACGGCACCTACAGTCAGGTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAG 
GCCCATCAGCGGCACTGCCGTGGCCCACAAGACGCCCCGGTGCCCGGGTTCCGTAAATGAAAAGTTACCCCAACGCG 
AAGGCACAGGAAAAACAGATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAGATATTGCA 
TCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAGTGTCATCCTG 
TGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCTGAGTGTGCGC 
ACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGACACGGGGCGC 
CCCATTCCCGGCACATCCACAAGGTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCG 
GGACCTGTACAAGGGCCGCCAGCTACAAGGTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACG 
CGCTGTCCACGGACATGGTCCACGCCGCCTCCGACCCCTCCTCCTCGTCAGGCAGGCTCTCAGAACCCGACCCCAGC 
CATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACTCCAGTGGAGACATCGGCAAAAA 
GGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGTGTGAAGAAGTTTGTTGAATACT 
GTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGGCGTGGCGAAAGAGGACGGCAAA 
GCGGACACCAAGAAACGCCACACCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACAGCTACTGTGGTTGAGAGG
AAAGGTGTCTTTTTATTGCTTCTAGAGACGTTGAAAGTGTGACCTGAGCACCTCATAATCATGTGAAAAAGACACTC 
AAAAACTACCATTTGAATGGATGGATGAAAATAACCTCCGTATATTCTACGAAGATGTTTAATAATAAATAGGTTTC 
GTTATAAGAGAATGTGTGTCACTTCGTCTCTTCCCTCACCCCCGAGACTTAGTGACAGTTATTTTTGACTTTTCCAA 
CTATACTATTTGCCTAGAAAATGTGTCTATTAAATAGCGTATTGAGAAAT 

<210> SEQ ID NO 41 

<211> Length : 1,173 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 41 
>AA161187_T0 

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA 
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 
GGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCATCACGTCGCGCAT 
CGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCACGTATGCG 
GAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCC 
TCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGTTA 
CTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTG 
CACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGGAAGTTCAGGT 
CGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAAGGACATCTTTGGAGACATGG 
TTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTGACTCAGGTGGACCCTTGGCCTGTAACAAGAAT 
GGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCCGGTGTCTACACCAA 
TATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCTCCTGGCCAC 
TACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTGAGCCCATGCAGCCTGGGG 
CCACTGCCAAGTCAGGCCCTGGTTCTCTTCTGTCTTGTTTGGTAATAAACACATTCCAGTTGATGCCTTGCAGGGCA 
TTCTTCAAAAGCAATGGC 

<210> SEQ ID NO 42 
<211> Length : 1,104 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 42 
>AA161187_T7 

GCACACACGCGAGGGGACCCTGGGTGGGCAAAAACGTGCTTTCCCGGACGGGGTTGAAGGGGAGAAAGGGAGAGGTC 
GGGCTTGGGGGGCTGCCTCCCGCGGCTCAGCAGTTCCTCTGACCATCCGAGGACCATGCGGCCGACGGGTCATCACG 
TCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCA 
CGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTA 
GTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTAC 
ACCCGTTACTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAA 
GCTGTCTGCACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACC 
GGACAGACTGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGGAA 
GTTCAGGTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAAGGACATCTTTGG 
AGACATGGTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTGACTCAGGTGGACCCTTGGCCTGTA 
ACAAGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCCGGTGTC 
TACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCTC 
CTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTGAGCCCATGCA 
GCCTGGGGCCACTGCCAAGTCAGGCCCTGGTTCTCTTCTGTCTTGTTTGGTAATAAACACATTCCAGTTGATGCCTT 
GCAGGGCATTCTTCAAAAGCAATGGC 

<210> SEQ ID NO 43 

<211> Length : 1,105

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 43 
>AA161187_T15 

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA 
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 
GGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCATCACGTCGCGCAT 
CGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCACGTATGCG 
GAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCC 
TCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGTTA 
CTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTG 
CACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGGAAGTTCAGGTCGCCATCATAAACAACTCTATGTGCA
ACCACCTCTTCCTCAAGTACAGTTTCCGCAAGGACATCTTTGGAGACATGGGTGACTCAGGTGGACCCTTGGCCTGT 
AACAAGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCCGGTGT 
CTACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCT 
CCTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTGAGCCCATGC 
AGCCTGGGGCCACTGCCAAGTCAGGCCCTGGTTCTCTTCTGTCTTGTTTGGTAATAAACACATTCCAGTTGATGCCT 
TGCAGGGCATTCTTCAAAAGCAATGGC 

<210> SEQ ID NO 44 

<211> Length : 1,466 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 44 
>AA161187_T16 

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA 
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 
GGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCATCACGTCGCGCAT 
CGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCACGTATGCG 
GAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCC 
TCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGTTA 
CTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTG 
CACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGGTTGCTGTCTCTCTCCTTCCCACTATCGTCCGCACAG 
CACTGCCATCTCCCCACACCCTCCAGGAAGTTCAGGTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTC 
AAGTACAGTTTCCGCAAGGACATCTTTGGAGACATGGTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTT 
CGTGAGTGTCCCTGCCACCACTCCCAGCCCAGGAAAGCATCCTGTGTCCCTGTGCCTTATTTGACCCTCATGCCAAC 
CCCGGGAGGTGGAGACTGTTGCCCCACTCTGCAGATGCAGAAACGGAGGCTTGGCTGCTGCCAGGGGGAGGAGGAGG 
ATGTGCACCCAGTCTACCCAGCCCCATAGCCCTTCCCACTCTCAGCCCCTCCCCTGCCCCACTCACTCTGCCCCAGG 
CTGACCTCAGCCCCGCTGCTCCCCAGGGTGACTCAGGTGGACCCTTGGCCTGTAACAAGAATGGACTGTGGTATCAG 
ATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCCGGTGTCTACACCAATATCAGCCACCACTT 
TGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTC 
TTCTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTGAGCCCATGCAGCCTGGGGCCACTGCCAAGTCAG 
GCCCTGGTTCTCTTCTGTCTTGTTTGGTAATAAACACATTCCAGTTGATGCCTTGCAGGGCATTCTTCAAAAGCAAT 
GGC

<210> SEQ ID NO 45

<211> Length : 1,354 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 45 
>AA161187_T20

GCACACACGCGAGGGGACCCTGGGTGGGCAAAAACGTGCTTTCCCGGACGGGGTTGAAGGGGAGAAAGGGAGAGGTC 
GGGCTTGGGGGGCTGCCTCCCGCGGCTCAGCAGTTCCTCTGACCATCCGAGGACCATGCGGCCGACGGGTCATCACG 
TCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCA 
CGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACTGACCTTAGTGATC 
CCTCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGT 
TACTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTC 
TGCACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAG 
ACTGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGGAAGTTCAG 
GTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAAGGACATCTTTGGAGACAT 
GGTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGTGAGTGTCCCTGCCACCACTCCCAGCCCAGGAA 
AGCATCCTGTGTCCCTGTGCCTTATTTGACCCTCATGCCAACCCCGGGAGGTGGAGACTGTTGCCCCACTCTGCAGA 
TGCAGAAACGGAGGCTTGGCTGCTGCCAGGGGGAGGAGGAGGATGTGCACCCAGTCTACCCAGCCCCATAGCCCTTC 
CCACTCTCAGCCCCTCCCCTGCCCCACTCACTCTGCCCCAGGCTGACCTCAGCCCCGCTGCTCCCCAGGGTGACTCA 
GGTGGACCCTTGGCCTGTAACAAGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGGCTGTGGTCG 
GCCCAATCGGCCCGGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCA 
TGTCCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGA 
GCCTACCTGAGCCCATGCAGCCTGGGGCCACTGCCAAGTCAGGCCCTGGTTCTCTTCTGTCTTGTTTGGTAATAAAC 
ACATTCCAGTTGATGCCTTGCAGGGCATTCTTCAAAAGCAATGGC 

<210> SEQ ID NO 46 

<211> Length : 1,171

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 46 
>AA161187_T21

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 
GGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCATCACGTCGCGCAT 
CGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCACGTATGCG 
GAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCC 
TCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGTTA 
CTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTG 
CACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGATAAGAGGACACAGTGAGAAGATGGGGGTCTGCTCGC 
CAGGAAGGAGCCCTCACCAGCAGCCGCATCGCTCAGCACCTTGATCCTGGACTTCCAGCCTCCAGAGCTGTGAGAAA 
CAAACCTCTATCATCTACCAGCCGCCCACGGCGTGGGATTTGTGTTACAGCAGCCTGAGCTGACCCAGACGCCAAGG 
AGCAACACACGCACCAGGGTAGGCTGGAGAAACCAGAACCCGGGAATCCCGCCTCCCTCAACTTGAAACTTGGGAAT 
AGTGTATTCTCTTTTCAACACTTGCACTAGTAGAAGGTTAATTACATGAAAGATTAGGCAAAATGTATGGCTATGTG 
TCCTGGTTTTCCAATAAAAGTATTGAGTTTCTCTGGGGAAAGTGCAGATAAAATGCTTAGTGGAGGCTGGGCGCTGT 
GGCTTATGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCAGGCAGGCAGATCACAAGGTCAGGAGTTTGAGACCGG 
CCTGGCCAATATGATG 

<210> SEQ ID NO 47 

<211> Length : 953 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 47 
>AA161187_T22 

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA 
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 
GGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCATCACGTCGCGCAT 
CGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCCCACGTATGCG 
GAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCC 
TCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAGGCCTACTACACCCGTTA 
CTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTG 
CACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGTGCTCACCAATGCCCCAGGCATCAGGCTCCTGGGCTG 
CCTCTCCATGCCTCCCACACCCACCCTAGCTCTGGCCGATTCTCCTGCAGCACTGAGCCCATTCCTCTCCCCAGAAA 
CTTCCAAGCCATGCTCAACCGCAGCTCCCACGGAAACCCCTCTGGGGGTTCCTCTGGTGGGCCTGCCCTGGCACCTG 
CGTGTCCCCCAACACACATGCCCTGAAAGAAGTGGGCCCAGCATCCGGAGGAGCCCCGGCAGCCCCAGACTGGGCGT
GTTCCCTGTATCAGGAATCCCTTCCCTCT 

<210> SEQ ID NO 48 

<211> Length : 3,110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 48 
>R66178_T2

GCAGCGAGCGCGGCTCCACATTGTTGCGGATCGCCGGCACCCGGCAGAGCGGCGGCGGCTGGGACGCGCGGCGCCTC 
CGACCCGTTCTCCTCGCGCCCGGCCGCGCAGCCAGAGCCACCCGGGCCGCCGACCGCCGAGCCCCGCGCCCGCCGCC 
TGGGCCCCGAGCCTTCTGCGCCGCCCGGGTGCGTCCCGCCACCCTCGGAGGACGGCCGGCCATGGACGCCTGCAAGT 
TGGAGCCGAGCGGGAGGGTGTGAGCGGGCCGGGGCCAGGAGCCCGCGCCGCGCAACCGGGCAGCCGGGCGCGCCGGG 
GGTGGGTCCCTCTCCCCAGCCCCGCTCTGCGTGGAAGAAGAGGGCGGGGACCGGCGCCGGGAGGAGAGCGGAGGAGG 
CGAAGGGGCATGACTCGTGCAACTTGCGGCGGGCATCTGCCGAGCCTCTGAGCCGGCGGCGGCCCGGGGCCCGGACT 
GCGGCCGCGCGGATCCACCCAGCCCACCCCGCCCCGGCCGACGGCTGCAGCTGACCTGGATCCTTCGAGCGCCCGCC 
GACCGCCAGCGATCTTCCCTCATCTTCCGGGCTGGTTTCTGCTGCGCGAGGAGCGCTGCCCTCGCCGCCCCTCTCGC 
CGGACCCCCGGCCCCCGATGGCTCGGATGGGGCTTGCGGGCGCCGCTGGACGCTGGTGGGGACTCGCTCTCGGCTTG 
ACCGCATTCTTCCTCCCAGGCGTCCACTCCCAGGTGGTCCAGGTGAACGACTCCATGTATGGCTTCATCGGCACAGA 
CGTGGTTCTGCACTGCAGCTTTGCCAACCCGCTTCCCAGCGTGAAGATCACCCAGGTCACATGGCAGAAGTCCACCA 
ATGGCTCCAAGCAGAACGTGGCCATCTACAACCCATCCATGGGCGTGTCCGTGCTGGCTCCCTACCGCGAGCGTGTG 
GAATTCCTGCGGCCCTCCTTCACCGATGGCACTATCCGCCTCTCCCGCCTGGAGCTGGAGGATGAGGGTGTCTACAT 
CTGCGAGTTTGCTACCTTCCCTACGGGCAATCGAGAAAGCCAGCTCAATCTCACGGTGATGGCCAAACCCACCAATT 
GGATAGAGGGTACCCAGGCAGTGCTTCGAGCCAAGAAGGGGCAGGATGACAAGGTCCTGGTGGCCACCTGCACCTCA 
GCCAATGGGAAGCCTCCCAGTGTGGTATCCTGGGAAACTCGGTTAAAAGGTGAGGCAGAGTACCAGGAGATCCGGAA 
CCCCAATGGCACAGTGACGGTCATCAGCCGCTACCGCCTGGTGCCCAGCAGGGAAGCCCACCAGCAGTCCTTGGCCT 
GCATCGTCAACTACCACATGGACCGCTTCAAGGAAAGCCTCACTCTCAACGTGCAGTATGAGCCTGAGGTAACCATT 
GAGGGGTTTGATGGCAACTGGTACCTGCAGCGGATGGACGTGAAGCTCACCTGCAAAGCTGATGCTAACCCCCCAGC 
CACTGAGTACCACTGGACCACGCTAAATGGCTCTCTCCCCAAGGGTGTGGAGGCCCAGAACAGAACCCTCTTCTTCA 
AGGGACCCATCAACTACAGCCTGGCAGGGACCTACATCTGTGAGGCCACCAACCCCATCGGTACACGCTCAGGCCAG 
GTGGAGGTCAATATCACAGGTGAGGGTCACAGTCTGCCCAXCAGTCCTGGAGTTCTTCAGACCCAGAATTGCGGGCC 
TTGAAGGCCAGTGTCCTGGAAGGGAGGCAGGTGTGAGCCTGCAAGTGTGCATGCCCAGCCATGGTATGGACATGTGT 
CTCTGGGCATGTAAATGTGAACCAGTGTGAACAGGCTCCGTCTGCATGCTGAGTGTGCATGTGGGAGCCCGTGGCTG 
TGCCGTGGCAACGTGCCATTCTCTGAGCCAGCGAATGGCAGTGTGTTGGGAGGTCTGAGAAGGCAGCTGCATCCGTG 
CCTCTGGGAGGATTCGGTTCTCCCCCAGCTTGCCGAGGCCCTGCCTGATGGTCTGACACGAGGCACAGCTGCTGCAG
CTGCAGATGGACAGAAGGGCTTCCCAGAGGTGGACCCCAGGCCTCCCCACTCTCCCTGTGGCTGGCTGCACTGCATG 
CTGGGGGGTGTAGTTCTTGCAGCTTCCAGGCCTAATCTGATGCCGGAGCATTTCCTGCCTGAGGAAGCGCCAGGCAT 
TGGTTTCGGAGGCAAACCCAAACATTCTCTTTGACCCCAAACCTCCAGATCCTAGATCCAGACTGTAAGCCCTAACA 
CTTCACTCCACCTCAGATCTATCCAAAGCCCCCAGCACCAGCCCACCCACCTCAGTCAGAGAACCAGGACCCCAAAG 
GCATGCAGAGCCCCCACTTCCCCACTGTCTTGGCCAGCCAGGGACCCCAGAAGAGAGGTTACAACCCTTCAGGAATA 
GGGACAAGCTGCTCCCTTTGTAAGAGGATGTGAGGGAGGCTGGCTGGGCCCCTGCCAGCAAACACAAATGACCCTGC 
GGCCTGGCTCTTCTCTCTCCTCCCAGCTGCGGCCCTTGCAGCTCTGCTCCTGGCACAGAGACAGGAGCTACTGGCTG 
AGTGTAACAGCTGGAGGGATGGAGGGGGAGGGGAGGACGCTCCACTCCACGCCAGACAGCCCCTTCTGCTTGCAAAT 
GAGTTAGATCCCCATGCTTCTCCTTTCTTCTCTCCCTCACTCAGTATCCTCACTCGAAAGTCTTTGATGCTGGAAGG 
TCATCCCAGCCATTCTGCTGCTGCTACACAGGCCCAGCCCTAAACAAAATAACCGGGGTTCTTTGGTCCCAAAAGAT 
CCCCAGGAAAGAGTAAACCTCCTTAAGACTTAAGGAAAAAAGTTGGCTAGGCGTGGTGGCTCACACCTGTAATCCCA 
GCACTTTGGGAGGCCGAGGTGGGCAGACCACTTGAGATCAGGAGTTCGAGACCAGCCTGGCCAACGTTGTGAAACCC 
CGTCTCTACAAAATATTAGCCGGGTGTGGCAGTGCGTGCCTGTAGTCCCAACTACTCAGGAGGCTGAGGCACAAGAA 
TTGCTTGAGCCTGGGAGGCGGAGGTTGCAGTGAGCCGAGATCGTGCCACTGCACTCCAGCCCAGGCGATGGAGCGAG 
ACTCTGTCCCGCAAAAAAAAAAAAAAAAAA

<210> SEQ ID NO 49 

<211> Length : 1,903

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 49 
>R66178_T3 

GCAGCGAGCGCGGCTCCACATTGTTGCGGATCGCCGGCACCCGGCAGAGCGGCGGCGGCTGGGACGCGCGGCGCCTC 
CGACCCGTTCTCCTCGCGCCCGGCCGCGCAGCCAGAGCCACCCGGGCCGCCGACCGCCGAGCCCCGCGCCCGCCGCC 
TGGGCCCCGAGCCTTCTGCGCCGCCCGGGTGCGTCCCGCCACCCTCGGAGGACGGCCGGCCATGGACGCCTGCAAGT 
TGGAGCCGAGCGGGAGGGTGTGAGCGGGCCGGGGCCAGGAGCCCGCGCCGCGCAACCGGGCAGCCGGGCGCGCCGGG 
GGTGGGTCCCTCTCCCCAGCCCCGCTCTGCGTGGAAGAAGAGGGCGGGGACCGGCGCCGGGAGGAGAGCGGAGGAGG 
CGAAGGGGCATGACTCGTGCAACTTGCGGCGGGCATCTGCCGAGCCTCTGAGCCGGCGGCGGCCCGGGGCCCGGACT 
GCGGCCGCGCGGATCCACCCAGCCCACCCCGCCCCGGCCGACGGCTGCAGCTGACCTGGATCCTTCGAGCGCCCGCC 
GACCGCCAGCGATCTTCCCTCATCTTCCGGGCTGGTTTCTGCTGCGCGAGGAGCGCTGCCCTCGCCGCCCCTCTCGC 
CGGACCCCCGGCCCCCGATGGCTCGGATGGGGCTTGCGGGCGCCGCTGGACGCTGGTGGGGACTCGCTCTCGGCTTG 
ACCGCATTCTTCCTCCCAGGCGTCCACTCCCAGGTGGTCCAGGTGAACGACTCCATGTATGGCTTCATCGGCACAGA 
CGTGGTTCTGCACTGCAGCTTTGCCAACCCGCTTCCCAGCGTGAAGATCACCCAGGTCACATGGCAGAAGTCCACCA 
ATGGCTCCAAGCAGAACGTGGCCATCTACAACCCATCCATGGGCGTGTCCGTGCTGGCTCCCTACCGCGAGCGTGTG 
GAATTCCTGCGGCCCTCCTTCACCGATGGCACTATCCGCCTCTCCCGCCTGGAGCTGGAGGATGAGGGTGTCTACAT
CTGCGAGTTTGCTACCTTCCCTACGGGCAATCGAGAAAGCCAGCTCAATCTCACGGTGATGGCCAAACCCACCAATT 
GGATAGAGGGTACCCAGGCAGTGCTTCGAGCCAAGAAGGGGCAGGATGACAAGGTCCTGGTGGCCACCTGCACCTCA 
GCCAATGGGAAGCCTCCCAGTGTGGTATCCTGGGAAACTCGGTTAAAAGGTGAGGCAGAGTACCAGGAGATCCGGAA 
CCCCAATGGCACAGTGACGGTCATCAGCCGCTACCGCCTGGTGCCCAGCAGGGAAGCCCACCAGCAGTCCTTGGCCT 
GCATCGTCAACTACCACATGGACCGCTTCAAGGAAAGCCTCACTCTCAACGTGCAGTATGAGCCTGAGGTAACCATT 
GAGGGGTTTGATGGCAACTGGTACCTGCAGCGGATGGACGTGAAGCTCACCTGCAAAGCTGATGCTAACCCCCCAGC 
CACTGAGTACCACTGGACCACGCTAAATGGCTCTCTCCCCAAGGGTGTGGAGGCCCAGAACAGAACCCTCTTCTTCA 
AGGGACCCATCAACTACAGCCTGGCAGGGACCTACATCTGTGAGGCCACCAACCCCATCGGTACACGCTCAGGCCAG 
GTGGAGGTCAATATCACAGCTTTCTGTCAACTTATCTATCCGGGCAAAGGGAGGACAAGAGCTAGGATGTTCTGAGG 
AGAGACTTCACCTGGGACGTGAAAGGAGCATGGGCTTGATGTCAGACAGCTGTGACCCTGGACAGGGCCCCCCCCAC 
CATCTGTAAAACGGGGACAGTATGAXGTACCTTGAAGGGCTGTTGTCAGAATTCTACGTGATGTAAGTCAAGCACCT 
AGCACAGATCAGTCTGTCAATAAATGGCCAATGTTCCGTGATTATTACCCTGACC 

<210> SEQ ID NO 50 

<211> Length : 2,364

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 50 
>R66178_T7 

GCAGCGAGCGCGGCTCCACATTGTTGCGGATCGCCGGCACCCGGCAGAGCGGCGGCGGCTGGGACGCGCGGCGCCTC 
CGACCCGTTCTCCTCGCGCCCGGCCGCGCAGCCAGAGCCACCCGGGCCGCCGACCGCCGAGCCCCGCGCCCGCCGCC 
TGGGCCCCGAGCCTTCTGCGCCGCCCGGGTGCGTCCCGCCACCCTCGGAGGACGGCCGGCCATGGACGCCTGCAAGT 
TGGAGCCGAGCGGGAGGGTGTGAGCGGGCCGGGGCCAGGAGCCCGCGCCGCGCAACCGGGCAGCCGGGCGCGCCGGG 
GGTGGGTCCCTCTCCCCAGCCCCGCTCTGCGTGGAAGAAGAGGGCGGGGACCGGCGCCGGGAGGAGAGCGGAGGAGG 
CGAAGGGGCATGACTCGTGCAACTTGCGGCGGGCATCTGCCGAGCCTCTGAGCCGGCGGCGGCCCGGGGCCCGGACT 
GCGGCCGCGCGGATCCACCCAGCCCACCCCGCCCCGGCCGACGGCTGCAGCTGACCTGGATCCTTCGAGCGCCCGCC 
GACCGCCAGCGATCTTCCCTCATCTTCCGGGCTGGTTTCTGCTGCGCGAGGAGCGCTGCCCTCGCCGCCCCTCTCGC 
CGGACCCCCGGCCCCCGATGGCTCGGATGGGGCTTGCGGGCGCCGCTGGACGCTGGTGGGGACTCGCTCTCGGCTTG 
ACCGCATTCTTCCTCCCAGGCGTCCACTCCCAGGTGGTCCAGGTGAACGACTCCATGTATGGCTTCATCGGCACAGA 
CGTGGTTCTGCACTGCAGCTTTGCCAACCCGCTTCCCAGCGTGAAGATCACCCAGGTCACATGGCAGAAGTCCACCA 
ATGGCTCCAAGCAGAACGTGGCCATCTACAACCCATCCATGGGCGTGTCCGTGCTGGCTCCCTACCGCGAGCGTGTG 
GAATTCCTGCGGCCCTCCTTCACCGATGGCACTATCCGCCTCTCCCGCCTGGAGCTGGAGGATGAGGGTGTCTACAT 
CTGCGAGTTTGCTACCTTCCCTACGGGCAATCGAGAAAGCCAGCTCAATCTCACGGTGATGGCCAAACCCACCAATT 
GGATAGAGGGTACCCAGGCAGTGCTTCGAGCCAAGAAGGGGCAGGATGACAAGGTCCTGGTGGCCACCTGCACCTCA 
GCCAATGGGAAGCCTCCCAGTGTGGTATCCTGGGAAACTCGGTTAAAAGGTGAGGCAGAGTACCAGGAGATCCGGAA
CCCCAATGGCACAGTGACGGTCATCAGCCGCTACCGCCTGGTGCCCAGCAGGGAAGCCCACCAGCAGTCCTTGGCCT 
GCATCGTCAACTACCACATGGACCGCTTCAAGGAAAGCCTCACTCTCAACGTGCAGTATGAGCCTGAGGTAACCATT 
GAGGGGTTTGATGGCAACTGGTACCTGCAGCGGATGGACGTGAAGCTCACCTGCAAAGCTGATGCTAACCCCCCAGC 
CACTGAGTACCACTGGACCACGCTAAATGGCTCTCTCCCCAAGGGTGTGGAGGCCCAGAACAGAACCCTCTTCTTCA 
AGGGACCCATCAACTACAGCCTGGCAGGGACCTACATCTGTGAGGCCACCAACCCCATCGGTACACGCTCAGGCCAG 
GTGGAGAATTCCCCTACACCCCGTCTCCTCCCGAACATGGGCGGCGCGCCGGGCCGGTGCCCACGGCCATCATTGGG 
GGCGTGGCGGGGAGCATCCTGCTGGTGTTGATTGTGGTCGGCGGGATCGTGGTCGCCCTGCGTCGGCGCCGGCACAC 
CTTCAAGGGTGACTACAGCACCAAGAAGCACGTGTATGGCAACGGCTACAGCAAGGCAGGCATCCCCCAGCACCACC 
CACCAATGGCACAGAACCTGCAGTACCCCGACGACTCAGACGACGAGAAGAAGGCCGGCCCACTGGGTGGAAGCAGC 
TATGAGGAGGAGGAGGAGGAGGAGGAGGGCGGTGGAGGGGGCGAGCGCAAGGTGGGCGGCCCCCACCCCAAATATGA 
CGAGGACGCCAAGCGGCCCTACTTCACCGTGGATGAGGCCGAGGCCCGTCAGGACGGCTACGGGGACCGGACTCTGG 
GCTACCAGTACGACCCTGAGCAGCTGGACTTGGCTGAGAACATGGTTTCTCAGAACGACGGGTCTTTCATTTCCAAG 
AAGGAGTGGTACGTGTAGCCCCCCTTCCAGAGCCTCTGTCTGTGACCGCTCCTAACCAGCCCCTCCCCGCACGCCCC 
CTGCCCACCCCCCACCTCCCACTCCAGGAGCTGAACAGAGACTTGCCCAGCTGCCCAAAGCCAGCCCCGAACTCCTG 
GGGGGCCAGGGGAGCCCAGGGCAGCCACGACTTGGCTTTGTGTTTTATTTCCTC 

<210> SEQ ID NO 51 

<211> Length : 3,125

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 51 
>HUMPHOSLIP_PEA_2_T6

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGGAGGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCG 
GACCTGCGGGGCAAAGAAGGCCACTTCTACTACAACATCTCTGAGGTGAAGGTCACAGAGCTGCAACTGACATCTTC 
CGAGCTCGATTTCCAGCCACAGCAGGAGCTGATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGAC 
AGCTGCTCTACTGGTTCTTCTATGATGGGGGCTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTG 
GAGCTCTCCCGGGATCCCGCTGGACGGATGAAAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGC 
GGCCTTCGGGGGAACCTTCAAGAAGGTGTATGATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCA 
ACCAGCAGATCTGCCCTGTCCTCTACCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGC 
AGTTCTGTGGACGAGCTTGTTGGCATTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACAT
GGACTTCCGGGGGGCCTTCTTCCCCCTGACTGAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGC 
AGGAGGAAGAGCGGATGGTGTATGTGGCCTTCTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCG 
GGGGCCCTGCAGCTGTTGCTGGTGGGGGACAAGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGG 
GAGCATTGTCCTGCTGAGCCCAGCAGTGATTGACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCT 
GCACCATCAAGCCCTCTGGCACCACCATCTCTGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCT 
GAGGTCCAGCTGTCCAGCATGACTATGGACGCCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCAC 
GCAGCTGGACCTGCGCAGGTTCCGAATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGG 
CCCCTCTGAAGACCATGCTGCAGATTGGGGTGATGCCCATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCA 
CTACCTGAGGGCATCAACTTTGTGCATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCA 
CTTTGCCAAAGGGCTGCGAGAGGTGATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGT 
CCACAGCAGCTGTCTGAGCCCTCAATCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTT 
TTCCCACATTCATAGCCTGTAGTGCCCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAA 
TTTAATTCCATAATCAATCTATCAATTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTC 
CTGGGATGGAATCAGTGCATCATAAAGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCT 
GCGGGAATAAAGTGCTAACTTGCCCCCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTG 
GGAAGGGGGCAGTCCTGGCGGCAGAACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAG 
GCTGGGAGGGGCTGCATCAGGCCGTGGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGA 
ACATGGTGAAGGCAGCGAGGGGCTTGTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCC 
TGACCCCACCATTCCCTCCTCCCCATATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCC 
CAGAACCCTCCCCACCCCTCTCCAAAAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAG 
GCAGCCAGAGGGAAAGGACGGGTGGGTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTA 
CCTTGATCGTGAGAAAGGCGATGTGGGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTC 
ACTAACCAGGGCCGGCGCTGCACCTCCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTC 
ACAGCTACCAGCAAGACCTTAGGGCTGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTG 
GCCACTCCCAGGCATACCCGCTCCCAATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTC 
TGGTTGAGGGAATCCACAAACCACTCATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGAT 
CTGGTATTTCTAAAGCAGGATGGGGTAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAA 
GGCCTACCCCTCAGGAAATTCAAGGGGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTA 
CTGCATGTCCAGTATTCTACTAAATGTTTTATCTGTGAAAGTAGA 

<210> SEQ ID NO 52 

<211> Length : 3,263 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens

<400> sequence : 52
>HUMPHOSLIP_PEA_2_T7

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
ACAACATCTCTGAGCCTGGACTTGAAAGGGGAGCAGACAAATTTCCTGTCGTTGGGGGAAGTTCCCTCTTCTTGGCC 
CTGGATCTGACCCTGAGGCCTCCTGTAGGGTGAAGGTCACAGAGCTGCAACTGACATCTTCCGAGCTCGATTTCCAG 
CCACAGCAGGAGCTGATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGACAGCTGCTCTACTGGTT 
CTTCTATGATGGGGGCTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTGGAGCTCTCCCGGGATC 
CCGCTGGACGGATGAAAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGCGGCCTTCGGGGGAACC 
TTCAAGAAGGTGTATGATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAGATCTGCCC 
TGTCCTCTACCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGCAGTTCTGTGGACGAGC 
TTGTTGGCATTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCCGGGGGGCC 
TTCTTCCCCCTGACTGAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGAT 
GGTGTATGTGGCCTTCTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGT 
TGCTGGTGGGGGACAAGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTGTCCTGCTG 
AGCCCAGCAGTGATTGACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTC 
TGGCACCACCATCTCTGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCAGCTGTCCA 
GCATGACTATGGACGCCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGC 
AGGTTCCGAATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTGAAGACCAT 
GCTGCAGATTGGGGTGATGCCCATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGCATCA 
ACTTTGTGCATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCAAAGGGCTG 
CGAGAGGTGATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCAGCTGTCTG 
AGCCCTCAATCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACATTCATAGC 
CTGTAGTGCCCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTCCATAATCA 
ATCTATCAATTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGT 
GCATCATAAAGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCT 
AACTTGCCCCCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCT 
GGCGGCAGAACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCA 
TCAGGCCGTGGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGC 
GAGGGGCTTGTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCACCATTCCC
TCCTCCCCATATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCCTCCCCACC 
CCTCTCCAAAAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGGAAAG 
GACGGGTGGGTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATCGTGAGAAA
GGCGATGTGGGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCAGGGCCGGC 
GCTGCACCTCCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGA
CCTTAGGGCTGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCCCAGGCATA
CCCGCTCCCAATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAGGGAATCCA 
CAAACCACTCATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATTTCTAAAGC 
AGGATGGGGTAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACCCCTCAGGA 
AATTCAAGGGGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGTCCAGTATT 
CTACTAAATGTTTTATCTGTGAAAGTAGA 

<210> SEQ ID NO 53 

<211> Length : 3,256 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 53 
>HUMPHOSLIP_PEA_2_T14

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
ACAACATCTCTGAGGTGAAGGTCACAGAGCTGCAACTGACATCTTCCGAGCTCGATTTCCAGCCACAGCAGGAGCTG 
ATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGACAGCTGCTCTACTGGTTCTTCTATGATGGGGG 
CTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTGGAGCTCTCCCGGGATCCCGCTGGACGGATGA 
AAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGCGGCCTTCGGGGGAACCTTCAAGAAGGTGTAT 
GATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAGGTGTGGGCAGCGACAGGTCGCAG 
GGTGGCAAGGGTGGGCATGCTCTCACTTTGAGAAGGCCCTGACTCTGGCTCCCACCTCGCAGATCTGCCCTGTCCTC 
TACCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGCAGTTCTGTGGACGAGCTTGTTGG 
CATTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCCGGGGGGCCTTCTTCC 
CCCTGACTGAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGATGGTGTAT 
GTGGCCTTCTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGTTGCTGGT 
GGGGGACAAGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTGTCCTGCTGAGCCCAG 
CAGTGATTGACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTCTGGCACC 
ACCATCTCTGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCAGCTGTCCAGCATGAC 
TATGGACGCCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGCAGGTTCC
GAATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTGAAGACCATGCTGCAG 
ATTGGGGTGATGCCCATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGCATCAACTTTGT
GCATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCAAAGGGCTGCGAGAGG
TGATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCAGCTGTCTGAGCCCTC
AATCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACATTCATAGCCTGTAGT
GCCCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTCCATAATCAATCTATC
AATTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGTGCATCAT
AAAGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCTAACTTGC
CCCCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCTGGCGGCA
GAACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCATCAGGCC
GTGGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGCGAGGGGC
TTGTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCACCATTCCCTCCTCCC
CATATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCCTCCCCACCCCTCTCC
AAAAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGGAAAGGACGGGT
GGGTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATCGTGAGAAAGGCGATG
TGGGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCAGGGCCGGCGCTGCAC
CTCCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGACCTTAGG
GCTGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCCCAGGCATACCCGCTC
CCAATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAGGGAATCCACAAACCA
CTCATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATTTCTAAAGCAGGATGG
GGTAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACCCCTCAGGAAATTCAA
GGGGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGTCCAGTATTCTACTAA
ATGTTTTATCTGTGAAAGTAGA

<210> SEQ ID NO 54 

<211> Length : 3,164

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 54 
>HUMPHOSLIP_PEA_2_T16

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
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ACAACATCTCTGAGGTGAAGGTCACAGAGCTGCAACTGACATCTTCCGAGCTCGATTTCCAGCCACAGCAGGAGCTG 

ATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGACAGCTGCTCTACTGGTTCTTCTATGATGGGGG 

CTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTGGAGCTCTCCCGGGATCCCGCTGGACGGATGA 

AAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGCGGCCTTCGGGGGAACCTTCAAGAAGGTGTAT 

GATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAGATCTGCCCTGTCCTCTACCACGC 

AGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTTCTGTGGACGAGCTTGTTGGCATTGACTATTCCCT 

CATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCCGGGGGGCCTTCTTCCCCCTGACTGAGAGGA 

ACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGATGGTGTATGTGGCCTTCTCTGAG 

TTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGTTGCTGGTGGGGGACAAGGTGCC 

CCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTGTCCTGCTGAGCCCAGCAGTGATTGACTCCC 

CATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTCTGGCACCACCATCTCTGTCACT 

GCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCAGCTGTCCAGCATGACTATGGACGCCCGTCT 

CAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGCAGGTTCCGAATCTATTCCAACC 

ATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTGAAGACCATGCTGCAGATTGGGGTGATGCCC 

ATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGCATCAACTTTGTGCATGAGGTGGTGAC 

GAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCAAAGGGCTGCGAGAGGTGATTGAGAAGAACC 

GGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCAGCTGTCTGAGCCCTCAATCCCCAAGCTGGC 

AGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACATTCATAGCCTGTAGTGCCCCCTCTAACCCC 

CAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTCCATAATCAATCTATCAATTACAGTCCGTCC 

ACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGTGCATCATAAAGGGCATTCTTTA 

AGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCTAACTTGCCCCCAGGCTGTCTAT 

GGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCTGGCGGCAGAACCCGGCCTGCAG 

GGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCATCAGGCCGTGGAGCTGGTTGCT 

GTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGCGAGGGGCTTGTCGGTGGGAACC 

ATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCACCATTCCCTCCTCCCCATATAACTGCTCAC 

TCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCCTCCCCACCCCTCTCCAAAAAAACTTGCCCA 

TACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGGAAAGGACGGGTGGGTCCTGCTCCTTT 

GCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATCGTGAGAAAGGCGATGTGGGAGAACTCCTTC 

ACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCAGGGCCGGCGCTGCACCTCCATCTGCCCCAC 

CAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGACCTTAGGGCTGGGAATTCCTCC 

ACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCCCAGGCATACCCGCTCCCAATCCTCCACAGC 

AGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAGGGAATCCACAAACCACTCATCCCCCATGAA

ATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATTTCTAAAGCAGGATGGGGTAAAAATGAGGGG 

TGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACCCCTCAGGAAATTCAAGGGGCCAAGCTAGAT

AACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGTCCAGTATTCTACTAAATGTTTTATCTGTGA 

AAGTAGA 
<210> SEQ ID NO 55

<211> Length : 2,886 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 55 
>HUMPHOSLIP_PEA_2_T17 

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
ACAACATCTCTGAGAAGGTGTATGATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAG 
ATCTGCCCTGTCCTCTACCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGCAGTTCTGT 
GGACGAGCTTGTTGGCATTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCC 
GGGGGGCCTTCTTCCCCCTGACTGAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAA 
GAGCGGATGGTGTATGTGGCCTTCTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCT 
GCAGCTGTTGCTGGTGGGGGACAAGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTG 
TCCTGCTGAGCCCAGCAGTGATTGACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATC 
AAGCCCTCTGGCACCACCATCTCTGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCA 
GCTGTCCAGCATGACTATGGACGCCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGG 
ACCTGCGCAGGTTCCGAATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTG 
AAGACCATGCTGCAGATTGGGGTGATGCCCATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGA 
GGGCATCAACTTTGTGCATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCA 
AAGGGCTGCGAGAGGTGATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCA 
GCTGTCTGAGCCCTCAATCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACA 
TTCATAGCCTGTAGTGCCCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTC 
CATAATCAATCTATCAATTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATG 
GAATCAGTGCATCATAAAGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAAT 
AAAGTGCTAACTTGCCCCCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGG 
GCAGTCCTGGCGGCAGAACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAG 
GGGCTGCATCAGGCCGTGGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTG 
AAGGCAGCGAGGGGCTTGTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCA 
CCATTCCCTCCTCCCCATATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCC 
TCCCCACCCCTCTCCAAAAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAG
AGGGAAAGGACGGGTGGGTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATC
GTGAGAAAGGCGATGTGGGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCA 
GGGCCGGCGCTGCACCTCCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTAC 
CAGCAAGACCTTAGGGCTGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCC 
CAGGCATACCCGCTCCCAATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAG 
GGAATCCACAAACCACTCATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATT 
TCTAAAGCAGGATGGGGTAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACC 
CCTCAGGAAATTCAAGGGGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGT 
CCAGTATTCTACTAAATGTTTTATCTGTGAAAGTAGA 

<210> SEQ ID NO 56 

<211> Length : 3,100 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 56 
>HUMPHOSLIP_PEA_2_T18

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
ACAACATCTCTGAGGTGAAGGTCACAGAGCTGCAACTGACATCTTCCGAGCTCGATTTCCAGCCACAGCAGGAGCTG 
ATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGACAGCTGCTCTACTGGTTCTTGAAGGTGTATGA 
TTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAGGTGTGGGCAGCGACAGGTCGCAGGG 
TGGCAAGGGTGGGCATGCTCTCACTTTGAGAAGGCCCTGACTCTGGCTCCCACCTCGCAGATCTGCCCTGTCCTCTA 
CCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGCAGTTCTGTGGACGAGCTTGTTGGCA 
TTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCCGGGGGGCCTTCTTCCCC 
CTGACTGAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGATGGTGTATGT 
GGCCTTCTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGTTGCTGGTGG 
GGGACAAGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTGTCCTGCTGAGCCCAGCA 
GTGATTGACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTCTGGCACCAC 
CATCTCTGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCAGCTGTCCAGCATGACTA 
TGGACGCCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGCAGGTTCCGA 
ATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTGAAGACCATGCTGCAGAT 
TGGGGTGATGCCCATGCTCAATGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGCATCAACTTTGTGC
ATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCAAAGGGCTGCGAGAGGTG 
ATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCAGCTGTCTGAGCCCTCAA 
TCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACATTCATAGCCTGTAGTGC 
CCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTCCATAATCAATCTATCAA 
TTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGTGCATCATAA 
AGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCTAACTTGCCC 
CCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCTGGCGGCAGA 
ACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCATCAGGCCGT 
GGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGCGAGGGGCTT 
GTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCACCATTCCCTCCTCCCCA 
TATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCCTCCCCACCCCTCTCCAA 
AAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGGAAAGGACGGGTGG 
GTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATCGTGAGAAAGGCGATGTG 
GGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCAGGGCCGGCGCTGCACCT 
CCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGACCTTAGGGC 
TGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCCCAGGCATACCCGCTCCC 
AATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAGGGAATCCACAAACCACT 
CATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATTTCTAAAGCAGGATGGGG 
TAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACCCCTCAGGAAATTCAAGG 
GGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGTCCAGTATTCTACTAAAT 
GTTTTATCTGTGAAAGTAGA 

<210> SEQ ID NO 57 

<211> Length : 3,254 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 57 
>HUMPHOSLIP_PEA_2_T19

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 
CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGAGCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTG 
CTGGCAGGCGCACATGCAGAGTTCCCAGGCTGCAAGATCCGCGTCACCTCCAAGGCGCTGGAGCTGGTGAAGCAGGA 
GGGGCTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCACCATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACT 
ACAACATCTCTGAGGTGAAGGTCACAGAGCTGCAACTGACATCTTCCGAGCTCGATTTCCAGCCACAGCAGGAGCTG
ATGCTTCAAATCACCAATGCCTCCTTGGGGCTGCGCTTCCGGAGACAGCTGCTCTACTGGTTCTTCTATGATGGGGG 
CTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTGGAGCTCTCCCGGGATCCCGCTGGACGGATGA 
AAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGCGGCCTTCGGGGGAACCTTCAAGAAGGTGTAT 
GATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTCCTCAACCAGCAGATCTGCCCTGTCCTCTACCACGC 
AGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTGTGCGCAGTTCTGTGGACGAGCTTGTTGGCATTGACT 
ATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGGACTTCCGGGGGGCCTTCTTCCCCCTGACT 
GAGAGGAACTGGAGCCTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGATGGTGTATGTGGCCTT 
CTCTGAGTTCTTCTTCGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGTTGCTGGTGGGGGACA 
AGGTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGGAGCATTGTCCTGCTGAGCCCAGCAGTGATT 
GACTCCCCATTGAAGCTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTCTGGCACCACCATCTC 
TGTCACTGCTAGCGTCACCATTGCCCTGGTCCCACCAGACCAGCCTGAGGTCCAGCTGTCCAGCATGACTATGGACG 
CCCGTCTCAGCGCCAAGATGGCTCTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGCAGGTTCCGAATCTAT 
TCCAACCATTCTGCACTGGAGTCGCTGGCTCTGATCCCATTACAGGCCCCTCTGAAGACCATGCTGCAGATTGGGGT 
GATGCCCATGCTCAATGGTAAGGCTGGGGTGTGAGGATGGAGGAAGAAAGGAGGGGTGAACTGGGCGGGCCCAGACT 
GAGCGGGGTGCTCCCACCCACAGAGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGCATCAACTTTGTGC 
ATGAGGTGGTGACGAACCATGCGGGATTCCTCACCATCGGGGCTGATCTCCACTTTGCCAAAGGGCTGCGAGAGGTG 
ATTGAGAAGAACCGGCCTGCTGATGTCAGGGCGTCCACTGCCCCCACACCGTCCACAGCAGCTGTCTGAGCCCTCAA 
TCCCCAAGCTGGCAGCTGTCATTCAGGACCCCAACCCCTCTCAGCCCCTCTTTTCCCACATTCATAGCCTGTAGTGC 
CCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGATTTGAAGCTGTACCCAATTTAATTCCATAATCAATCTATCAA 
TTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGTGCATCATAA 
AGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCTAACTTGCCC 
CCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCTGGCGGCAGA 
ACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCATCAGGCCGT 
GGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGCGAGGGGCTT 
GTCGGTGGGAACCATGTGGCCGGCGCCCTGAGGAGCAATGTTCGTGAGCTCCTGACCCCACCATTCCCTCCTCCCCA 
TATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAACCCTTTATTCTTCCCAGAACCCTCCCCACCCCTCTCCAA 
AAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGACCCAGCCTAAAAGGCAGCCAGAGGGAAAGGACGGGTGG 
GTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCAGGCCCAGTCCCTACCTTGATCGTGAGAAAGGCGATGTG 
GGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCACTAACCAGGGCCGGCGCTGCACCT 
CCATCTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGACCTTAGGGC 
TGGGAATTCCTCCACACTTGCCCTCTGTGGGCCAGAGCCAGGCAGCCAGCTGGCCACTCCCAGGCATACCCGCTCCC 
AATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGAATCTCTACCTTACCTTCTGGTTGAGGGAATCCACAAACCACT 
CATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATAGGATCTGGTATTTCTAAAGCAGGATGGGG 
TAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGTTCAAGGCCTACCCCTCAGGAAATTCAAGG 
GGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGACTTACTGCATGTCCAGTATTCTACTAAAT 
GTTTTATCTGTGAAAGTAGA 
<210> SEQ ID NO 58

<211> Length : 1,533 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 58 
>AI076020_T0

CGCTCAGTCCGGCAGCGCAGCAGGAGGGAGGCGCGAGGCAGGAGCCGGCGGCTGGGCTCCGCAGCGCAGCCAGCGCA 
GCGCGGCGCCCCGGGCCCGGCCCCATGCCCGCAGCCCCGCCGGACCGTCCTTGAGCGCGGGCGCCTAGCCCGCGCCC 
CCTGCCCGCCGGCACCATTGCCCCGACGGCGCGGCCGGGCGGCCCGGCGCTCCCCAGGCTCCGCGCGGGCCGAAAGA 
CGCTGCTAGCGGCCGCCGCGGGTGTGGTGATGCTGCTGGTGCTGGTGGTGCTCATCCCCGTGCTGGTGAGCTCGGGC 
GGCCCGGAAGGCCACTATGAGATGCTGGGCACCTGCCGCATGGTGTGCGACCCCTACCCCGCGCGGGGCCCCGGCGC 
CGGCGCGCGGACCGACGGCGGCGACGCCCTGAGCGAGCAGAGCGGCGCGCCCCCGCCTTCCACGCTGGTGCAGGGCC 
CCCAGGGGAAGCCGGGCCGCACCGGCAAGCCCGGCCCTCCGGGGCCTCCCGGGGACCCAGGTCCTCCCGGCCCTGTG 
GGGCCGCCGGGGGAGAAGGGTGAGCCAGGCAAGCCGGGCCCTCCGGGGCTGCCGGGCGCGGGGGGCAGCGGCGCCAT 
CAGCACTGCCACCTACACCACGGTGCCGCGCGTGGCCTTCTACGCCGGCCTCAAGAACCCCCACGAGGGTTACGAGG 
TACTCAAGTTTGACGACGTGGTCACCAACCTAGGCAACAACTACGACGCGGCCAGCGGCAAGTTTACGTGCAACATT 
CCCGGCACCTACTTTTTCACCTACCATGTCCTCATGCGCGGCGGCGACGGCACCAGTATGTGGGCAGACCTCTGCAA 
GAATGGCCAGGTGCGGGCCAGTGCTATTGCCCAGGACGCGGACCAGAACTACGACTACGCCAGCAACAGCGTGATCC 
TGCACCTGGACGCCGGCGACGAGGTCTTCATCAAGCTGGATGGAGGCAAAGCACACGGCGGCAACAGCAACAAATAC 
AGCACGTTCTCTGGCTTCATCATCTACTCCGACTGAGCTCCCCACGTCTCCCTCCACCCACGTCCCTCACCCGCCGG 
GGTCCCCTCCGGGCGGGGCAGACGATGACTCGCCCCTCGCCCACCCGCTCGCTGCCCGGCCCTCCCCGGCTATGACG 
CCCCCGGCCCGTGCTCAACACCGCCTGGGCCACAGCTAGGCCCTCCCACCGGCTCGCTGCAGAGCCGGGCCCAGCGC 
GCCCTGTCCCCGTGCCAGGGAACCGGGGTTGACCGCCCCCGCCCAGCCCGCGCTATATATTTGTACAATAGGACTGT 
TTACTGCCCACCTCCGCCTGCCAGCCCACCCCAGCCTGGGGAGAGGTCGCGGCGGCGGGTTTGCTTCCTGCGCTCTG 
AGATGAGCTGCCCTCGGCTCCCTCCGGGGTGGCGCGCCCGGGGGAGGGGGGAGTTGGGGGCTGGATAGCTTCCCAGC 
ACCCTCAGAGCCCCCGCCCGGCTGTGCCCCGTCTGACCAAAGTTATAATAAAAACATTTTCACCCCGCAG 

<210> SEQ ID NO 59 

<211> Length : 6,659 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 
<400> sequence : 59
>M79217_PEA_1_T1

TGATGTCTGCTAATGGAAGATAAATGAGAAGCAAACTGGGAAATACATTTTGTCTCAGGATTACTGCATCTACTACT 

GGATAAAGATCAAAAGAGTTCTCTTCAACCCTTTCAACTCTACATTTAACAAATTGAGCTTTTCAGAGTCTTTTTTT 

GTAAAGTATTTCCAAAGAAGACTTATAGGTTAGGAATAAACATAAACTACCCAGGTTGGCTAGGAAGGTATTTCTGT 

TCATCTAAAGATGATGCCCAGGTGTGGAACAGGATAAGAAAAGACCATGGACATCTTTGTCCCATGAATTTAGTTGG 

TCATCGTGTTACAGGGCTATAATGCCGCTCTAGATCCAGTTAAATAAGAAGTGGGGAGGGGTTGTAAACTGCAGCTT 

TTTGGGGCACTTATCCATTTATTACCCCAAGTAAAAGACCTATACCAAACAGCATACAACATCTCTGCATTGTCATT

ATAATGTTCTTTGAGACACAGCCAGTGTTCCAGCCATTGTTCCATCTAAGATTTAAGCATTTTCTAGAAATGTATGG 

TGGCAGGGGTGTTGAACATAACTTCTTCAAGACTGACATGGTTCTCTTTCTTTTGCAGGCCTGATTGTTGGCAAAGG 

CATCATAAGAAGCTGGCATTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATAGACACTATGCATATTTGTT 

GGTCAGCAAAACCAAGAAACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAGGGTGTTTTTCCTGGGTTTC 

ATCATCAGGTACCTCCTCCCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTGATAAAGATTAAGGACATGT 

TCTTTGGTCAACAGCCAGAACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAGCTGCAGCTGAGGAAAATGA 

AATGTTCATTTTATTTGGTGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTGTCAGTGAAACAGAGATCGT 

TTTGTGGAATAGCTAACCCATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAGGCTGCAGAGGACTCATGAC 

AGGCTATACCATGCTGCGGAATGGGGGCGCGGGGAACGGAGGTCAGACCTGCATGCTGCGCTGGTCCAACCGCATCC 

GCCTCACGTGGCTCAGCTTCACGCTCTTTGTCATCCTGGTCTTCTTCCCGCTCATCGCCCACTATTACCTCACCACT 

CTGGATGAGGCTGATGAGGCAGGCAAGCGGATTTTTGGTCCCCGGGTGGGGAACGAGCTGTGCGAGGTGAAGCACGT 

GCTGGATCTGTGCCGCATCCGGGAGTCGGTGAGTGAAGAGCTCCTGCAGCTGGAGGCCAAGCGCCAAGAGCTGAACA 

GCGAGATCGCCAAGCTGAATCTGAAGATCGAAGCCTGTAAGAAGAGCATTGAGAACGCCAAGCAGGACCTGCTCCAG 

CTCAAGAATGTCATCAGCCAGACCGAGCATTCCTACAAGGAGCTCATGGCCCAGAACCAGCCCAAGCTGTCCCTGCC 

CATCCGACTGCTCCCAGAGAAGGACGATGCCGGCCTCCCTCCCCCGAAGGCCACTCGGGGCTGCCGGCTACACAACT 

GCTTTGATTATTCTCGTTGCCCTCTCACCTCTGGCTTCCCGGTCTACGTCTATGACAGTGACCAGTTTGTCTTTGGC 

AGCTACCTGGATCCCTTGGTCAAGCAGGCTTTTCAGGCGACAGCACGAGCTAACGTTTATGTTACAGAAAATGCAGA 

CATCGCCTGCCTTTACGTGATACTAGTGGGAGAGATGCAGGAGCCGGTGGTGCTGCGGCCTGCTGAGCTGGAGAAGC 

AGTTGTATTCCCTGCCACACTGGCGGACGGATGGACACAACCATGTCATCATCAATCTGTCACGTAAGTCAGATACA 

CAGAACCTTCTCTATAACGTCAGTACTGGCCGTGCCATGGTGGCCCAGTCCACCTTCTACACTGTCCAGTACAGACC 

TGGCTTTGACTTGGTCGTATCACCGCTGGTCCATGCCATGTCTGAGCCCAACTTCATGGAAATCCCACCACAGGTGC 

CGGTGAAGCGGAAATATCTCTTCACCTTCCAGGGCGAGAAGATTGAGTCTCTGAGGTCTAGCCTTCAGGAGGCCCGC 

TCCTTCGAAGAGGAAATGGAGGGCGACCCTCCCGCCGACTACGATGACCGGATCATTGCCACCCTGAAGGCGGTGCA 

GGACAGCAAGCTGGATCAGGTCCTGGTGGAATTCACCTGCAAAAACCAGCCCAAACCCAGCCTGCCAACTGAGTGGG 

CACTGTGTGGAGAGCGGGAGGACCGCTTGGAATTGCTGAAGCTCTCCACCTTCGCCCTCATCATTACCCCCGGGGAC 

CCTCGCTTGGTTATTTCCTCTGGGTGTGCAACACGGCTCTTCGAAGCCCTGGAAGTCGGTGCCGTCCCGGTGGTGCT 

GGGGGAGCAGGTCCAGCTTCCCTACCAGGACATGCTGCAGTGGAACGAGGCGGCCCTGGTGGTGCCAAAGCCTCGTG 

TTACCGAGGTTCATTTCCTGCTCAGAAGCCTCTCCGATAGTGACCTCCTGGCTATGAGGCGGCAAGGCCGCTTTCTC 
TGGGAGACTTACTTCTCCACTGCTGACAGTATTTTTAATACCGTGCTGGCTATGATTAGGACTCGCATCCAGATCCC

AGCCGCTCCCATCCGGGAAGAGGCGGCAGCTGAGATCCCCCACCGTTCAGGCAAGGCGGCTGGAACTGACCCCAACA 

TGGCTGACAACGGGGACCTGGACCTGGGGCCAGTGGAGACGGAGCCGCCCTACGCCTCACCCAGATACCTCCGCAAT 

TTCACTCTGACTGTCACTGACTTTTACCGCAGCTGGAACTGTGCTCCAGGGCCTTTCCATCTTTTCCCCCACACTCC 

CTTTGACCCTGTGTTGCCCTCAGAGGCCAAATTCTTGGGCTCAGGGACTGGCTTTCGGCCTATTGGTGGTGGAGCTG 

GGGGTTCTGGCAAGGAATTTCAGGCAGCGCTTGGAGGCAATGTTCCCCGAGAGCAGTTCACGGTGGTGATGTTGACT 

TATGAGCGGGAGGAAGTGCTTATGAACTCTTTAGAGAGGCTGAATGGCCTCCCTTACCTGAACAAGGTCGTGGTGGT 

GTGGAATTCTCCCAAGCTGCCATCAGAGGACCTTCTGTGGCCTGACATTGGCGTCCCCATCATGGTGGTCCGTACTG 

AGAAGAACAGTTTGAACAACCGATTCTTACCCTGGAATGAAATTGAGACAGAGGCCATCCTGTCCATTGATGACGAT 

GCTCACCTCCGCCATGACGAAATCATGTTTGGGTTCCGGGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCC 

TGGCCGTTACCACGCATGGGACATCCCCCATCAGTCCTGGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGG 

TGCTGACAGGTGCTGCCTTCTTTCACAAGTATTATGCCTACCTGTATTCTTATGTGATGCCCCAGGCCATCCGGGAC 

ATGGTGGATGAATACATCAACTGTGAGGACATTGCCATGAACTTCCTTGTCTCCCACATCACTCGGAAGCCCCCCAT 

CAAGGTGACCTCACGGTGGACATTCCGATGCCCAGGATGCCCTCAGGCCCTGTCTCATGATGACTCCCACTTCCACG 

AGCGGCACAAGTGCATCAACTTCTTCGTGAAGGTGTACGGCTACATGCCCCTCCTGTACACGCAGTTCAGGGTGGAT 

TCTGTGCTCTTCAAGACACGCCTGCCCCATGACAAGACCAAGTGCTTCAAGTTCATCTAGGGGCAGCGCACGGTCTG 

GGGAAGAGGATGAGCAGAGGGAGGAAGATGGCTCCCAAGGTTCCTAGGCATTGCAGGACCTTGGGCACATCTGCTGG 

TGGGTGGCCCAGAGCCTCTGCTGGAAGGGGCAGCAGGAGGAGTGGAAGGAAACCGCTGCCTTTATCTTGAAGTCAGC 

CACACTGGGCCTGGAGCCCTGGGCGGAGTCCCCGGGGTTCCCCACACAGGGCACTGACTGATAGCTTACACTGAGGA 

CTGTGGCGACTCTGCAGAGTCACTCACACCGTTCGTACGCCCAGGACAGCTGGTTCGTGGTTTTTACATTCAATAAC 

AACTATTATGATTATTTAAAAAGAGAAAGTTTCAGATTTGCCATTCAAGGCTTATTTATATATATGTGTGTGTATAT 

AAATACATGCACACACTTGCATACATATATATTTTTGGCTGGGGGAGTGTGAGTTTTGCCTTTCTAAGGGAGGGACC 

GCGCAGGCTCCTTTGTTCTGTATTCTGGCGGAGATGGGTCCTGGCCTTGTGTCACTGGCTTATCCTTAAAGATCATC 

TCCCATCCTCCCCAGCGCCATCTGTGTGCAGCAACCAGAAAGGGATGAACTTGGCCCTCTTGCGGGCCTGGACAAGG 

TCTCTTCCTTACCCTTTCTGTTGCCAGTCAGCAACCTGTAACTCACATTCTCTTCCCAGTGAATCCCTGGGAGCGCC 

TGACCCTGGTGGGCTGTTCAGCTTCCTGCTGCTGGGGCCAGCGATTTTTGAGGATTTATCTTTAGGCCAGGCTTGCC 

TCCGTACTTATCCCTGCTCTCCCATTTCTCTCTTGTTTGAGAGAGAATGAGGAAGCAAAGAGTGAGAAAGAATAGGG 

GCTGAAGACGCCACTCCCAGATGGCTCTTTCTATCCTGCTCTTCTGTTGAAACACACGTGCTGTGGGCCTCAGGCGT 

TTCTGAAGTGCTCTTTCTTGGATTGGACAGGAGATCAGCAGCGTGCACATCTGCTGTGGTCTGAAGTGGTTTGCAGG 

TCAGCCTCCTCTCCCTAGTGTAGAGCAAGCCAGTGTCCTTCGAGGAACCCACCCGGCTGGCCGGGAAGTTTTACAGC 

AAGGCGCCTGCCTTGGGATAATTCCTTGGTGAAATTCACCTTCCCCCCGCCTCTGTCTGGAGCCCCATCCTGTGTTA 

TCTGTGGTTTTTGGACCCCTAATGTCAGCTTGGCTGTAGGACTCCCCGAGGTTTGGTATGTGCTAGAACAATGGGAG 

GCTGTGATTTGCTGTGTAAGCTCACATCCAGCCTTGGAATCTAACGGGCATTCACAACCCGAGTTACCACTTTCCAC 

TCCCTGCTTAGGATTCTGTTCCCTGGGCTGAAACTGAAATAAGCTAATTTTTTGGGTCACGGTGGCAGTAGGGGAAC 

CTAGGAGGGTGTGAGTGGCATTTGTCAGGGATTTAGCCCATGACGTGTTTCTTGAACCCTACTTTCTGGAAGTGGAG 

TTGACTCTGGAAGTTTTCTAGCAACTGAACAAAAGCTCAGGTTTGTCCTGGTCATGCACATGCCTTAAGCCAGTTCC 

GTCTTCCCTAGACCTTGGCATCCTGTGCTTCTATTTCTTGGAATACGTTCTCCTCTGACCTGCCTGTACCACGTGGG 

TCCTCTTCAAGTACTGTTTTGAAGCTGGGCTCTTTTGTGTAGCTCCCACCCACCTGTAGGGCTAGCTCGGCTTAAGG 
GAACTCTCCCCATTGGCAAACCGGACCCGGCCGCCGCCAGGACTGTGTTTCCAAAGGTTCCCCGCCCCCAACCCCAG
CATCAGCCTGTAGCTCCCCTGCTGAGGCAGTGTGGTTATGTTCCCAGCAGTGGGGGTCAGACGCCCTTCCTCAGAAC 
TTTCTAGTTGCCCTCTACCTGACTCCTGACTTGTATTCCTTTTAGCAGTAGCCTTCTTCCCTCGGGGAGCCAAAGAG 
TGTGGTGTGTGGCGCTATATTGTGGCTGCTATTTCATCTGGTTTCTTTTAATGTGAGGAACTCACATACTGACTTCA 
GTGGGACTCGGTGAGCCGGGGCCGTCTGTGTGGTGGGACCCCCTTTAGCGGGACTCAGTGAGCTGGGGCCGTCTGTG 
TGGTGGAGCCAGGGCCTCTCCCTTTAGTGGAGCCAGGTTGTCGGGCCCCGAATGTCACTGGTGGATCTAAGAAGGGC 
TGAGTGGTCTGACACCAAAACATGCCGCAGGGAGGGCTGTGGTGCCGGTGCTTCCAACAAGGACAGCCCTCCTTGAC 
CCTGAAAGGAACACTGGCTTGAAGGACTGCAGACAGGCTCTGAGGGGCACGCCCTCCTCAGCGAGAGGCAGCAAGGT 
GGCCACAGTGTCACTGGTCAGGTGCTTCTCACCACGGGAAAGCCGCCGACCTGTGACTCGCTTGAGATGGGAAAGCG 
GCGCCACAGACCCCGGGTCTCCTTGGCTGTCTGTGGGCCGCCCCTGGCCACCTTGTCCTGGCTCGCAGGGTGCAGGA 
GCGCCTCGTTCTCTGGGTGGCCGGCTTGCTGCTCCGGTTTGGGCTGTCTTACCATAACACCGTCCCAGGGCTCTGCA 
GGCCACTGTGAGCGCTGGCTCCCTGGGCAGTGCTCCTCCGTGTGGACTGTGCCTCAGGCCAGGGCTCACCAGCTGGG 
GTCCTGTCCGGAAGGATGGGATCTTTCTGGGAGCTGCGCCGGACAGAGTGGGGAGCTCCTAGTTTGTGGGGGGAAGC 
TTTGATATCCATGCCACGTCCATCCACCCCACCCCTTTTCGTCACGAGCACAATGGTCTTACATTGGATTTTTGTAA 
AAAAATAAAAATAAATGGAGACTTTAACTCAAGCAGC 

<210> SEQ ID NO 60 

<211> Length : 6,333 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 60 
>M79217_PEA_1_T3

GGCTGCAGTGGCCGCCGCTGGAGACCGCGGGACCTGGCGCTGCAGCGAGATTTTCAAACACCGCATTCCCCTTAAAC 
CTCCCAGGCTCCCATCACTCCCAGCAAAGAGCCAGCCCGCTACTTTATGGAGACAGGAGAGACTGTCAGACAGTTCC 
TGCTGCTGCCCAGGTGCATCTGAGAGAGCAAGCCCTGGAGGTTCACTCTTTCAAGAAGTCGTGTGCTGAGGTGTAAT 
GCTACACAAGTCAGAGGAAGGAAGGGTCCTGAAACACATGGCCTGATTGTTGGCAAAGGCATCATAAGAAGCTGGCA 
TTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATAGACACTATGCATATTTGTTGGTCAGCAAAACCAAGAA 
ACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAGGGTGTTTTTCCTGGGTTTCATCATCAGGTACCTCCTC 
CCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTGATAAAGATTAAGGACATGTTCTTTGGTCAACAGCCAG 
AACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAGCTGCAGCTGAGGAAAATGAAATGTTCATTTTATTTGG 
TGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTGTCAGTGAAACAGAGATCGTTTTGTGGAATAGCTAACC 
CATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAGGCTGCAGAGGACTCATGACAGGCTATACCATGCTGCG 
GAATGGGGGCGCGGGGAACGGAGGTCAGACCTGCATGCTGCGCTGGTCCAACCGCATCCGCCTCACGTGGCTCAGCT 
TCACGCTCTTTGTCATCCTGGTCTTCTTCCCGCTCATCGCCCACTATTACCTCACCACTCTGGATGAGGCTGATGAG 
GCAGGCAAGCGGATTTTTGGTCCCCGGGTGGGGAACGAGCTGTGCGAGGTGAAGCACGTGCTGGATCTGTGCCGCAT 
CCGGGAGTCGGTGAGTGAAGAGCTCCTGCAGCTGGAGGCCAAGCGCCAAGAGCTGAACAGCGAGATCGCCAAGCTGA

ATCTGAAGATCGAAGCCTGTAAGAAGAGCATTGAGAACGCCAAGCAGGACCTGCTCCAGCTCAAGAATGTCATCAGC 

CAGACCGAGCATTCCTACAAGGAGCTCATGGCCCAGAACCAGCCCAAGCTGTCCCTGCCCATCCGACTGCTCCCAGA 

GAAGGACGATGCCGGCCTCCCTCCCCCGAAGGCCACTCGGGGCTGCCGGCTACACAACTGCTTTGATTATTCTCGTT 

GCCCTCTCACCTCTGGCTTCCCGGTCTACGTCTATGACAGTGACCAGTTTGTCTTTGGCAGCTACCTGGATCCCTTG 

GTCAAGCAGGCTTTTCAGGCGACAGCACGAGCTAACGTTTATGTTACAGAAAATGCAGACATCGCCTGCCTTTACGT 

GATACTAGTGGGAGAGATGCAGGAGCCGGTGGTGCTGCGGCCTGCTGAGCTGGAGAAGCAGTTGTATTCCCTGCCAC 

ACTGGCGGACGGATGGACACAACCATGTCATCATCAATCTGTCACGTAAGTCAGATACACAGAACCTTCTCTATAAC 

GTCAGTACTGGCCGTGCCATGGTGGCCCAGTCCACCTTCTACACTGTCCAGTACAGACCTGGCTTTGACTTGGTCGT 

ATCACCGCTGGTCCATGCCATGTCTGAGCCCAACTTCATGGAAATCCCACCACAGGTGCCGGTGAAGCGGAAATATC 

TCTTCACCTTCCAGGGCGAGAAGATTGAGTCTCTGAGGTCTAGCCTTCAGGAGGCCCGCTCCTTCGAAGAGGAAATG 

GAGGGCGACCCTCCCGCCGACTACGATGACCGGATCATTGCCACCCTGAAGGCGGTGCAGGACAGCAAGCTGGATCA 

GGTCCTGGTGGAATTCACCTGCAAAAACCAGCCCAAACCCAGCCTGCCAACTGAGTGGGCACTGTGTGGAGAGCGGG 

AGGACCGCTTGGAATTGCTGAAGCTCTCCACCTTCGCCCTCATCATTACCCCCGGGGACCCTCGCTTGGTTATTTCC 

TCTGGGTGTGCAACACGGCTCTTCGAAGCCCTGGAAGTCGGTGCCGTCCCGGTGGTGCTGGGGGAGCAGGTCCAGCT

TCCCTACCAGGACATGCTGCAGTGGAACGAGGCGGCCCTGGTGGTGCCAAAGCCTCGTGTTACCGAGGTTCATTTCC 

TGCTCAGAAGCCTCTCCGATAGTGACCTCCTGGCTATGAGGCGGCAAGGCCGCTTTCTCTGGGAGACTTACTTCTCC 

ACTGCTGACAGTATTTTTAATACCGTGCTGGCTATGATTAGGACTCGCATCCAGATCCCAGCCGCTCCCATCCGGGA 

AGAGGCGGCAGCTGAGATCCCCCACCGTTCAGGCAAGGCGGCTGGAACTGACCCCAACATGGCTGACAACGGGGACC 

TGGACCTGGGGCCAGTGGAGACGGAGCCGCCCTACGCCTCACCCAGATACCTCCGCAATTTCACTCTGACTGTCACT 

GACTTTTACCGCAGCTGGAACTGTGCTCCAGGGCCTTTCCATCTTTTCCCCCACACTCCCTTTGACCCTGTGTTGCC 

CTCAGAGGCCAAATTCTTGGGCTCAGGGACTGGCTTTCGGCCTATTGGTGGTGGAGCTGGGGGTTCTGGCAAGGAAT 

TTCAGGCAGCGCTTGGAGGCAATGTTCCCCGAGAGCAGTTCACGGTGGTGATGTTGACTTATGAGCGGGAGGAAGTG 

CTTATGAACTCTTTAGAGAGGCTGAATGGCCTCCCTTACCTGAACAAGGTCGTGGTGGTGTGGAATTCTCCCAAGCT 

GCCATCAGAGGACCTTCTGTGGCCTGACATTGGCGTCCCCATCATGGTGGTCCGTACTGAGAAGAACAGTTTGAACA 

ACCGATTCTTACCCTGGAATGAAATTGAGACAGAGGCCATCCTGTCCATTGATGACGATGCTCACCTCCGCCATGAC 

GAAATCATGTTTGGGTTCCGGGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCCTGGCCGTTACCACGCATG 

GGACATCCCCCATCAGTCCTGGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGGTGCTGACAGGTGCTGCCT 

TCTTTCACAAGTATTATGCCTACCTGTATTCTTATGTGATGCCCCAGGCCATCCGGGACATGGTGGATGAATACATC 

AACTGTGAGGACATTGCCATGAACTTCCTTGTCTCCCACATCACTCGGAAGCCCCCCATCAAGGTGACCTCACGGTG 

GACATTCCGATGCCCAGGATGCCCTCAGGCCCTGTCTCATGATGACTCCCACTTCCACGAGCGGCACAAGTGCATCA 

ACTTCTTCGTGAAGGTGTACGGCTACATGCCCCTCCTGTACACGCAGTTCAGGGTGGATTCTGTGCTCTTCAAGACA 

CGCCTGCCCCATGACAAGACCAAGTGCTTCAAGTTCATCTAGGGGCAGCGCACGGTCTGGGGAAGAGGATGAGCAGA 

GGGAGGAAGATGGCTCCCAAGGTTCCTAGGCATTGCAGGACCTTGGGCACATCTGCTGGTGGGTGGCCCAGAGCCTC 

TGCTGGAAGGGGCAGCAGGAGGAGTGGAAGGAAACCGCTGCCTTTATCTTGAAGTCAGCCACACTGGGCCTGGAGCC 

CTGGGCGGAGTCCCCGGGGTTCCCCACACAGGGCACTGACTGATAGCTTACACTGAGGACTGTGGCGACTCTGCAGA 

GTCACTCACACCGTTCGTACGCCCAGGACAGCTGGTTCGTGGTTTTTACATTCAATAACAACTATTATGATTATTTA 

AAAAGAGAAAGTTTCAGATTTGCCATTCAAGGCTTATTTATATATATGTGTGTGTATATAAATACATGCACACACTT 
GCATACATATATATTTTTGGCTGGGGGAGTGTGAGTTTTGCCTTTCTAAGGGAGGGACCGCGCAGGCTCCTTTGTTC

TGTATTCTGGCGGAGATGGGTCCTGGCCTTGTGTCACTGGCTTATCCTTAAAGATCATCTCCCATCCTCCCCAGCGC 

CATCTGTGTGCAGCAACCAGAAAGGGATGAACTTGGCCCTCTTGCGGGCCTGGACAAGGTCTCTTCCTTACCCTTTC 

TGTTGCCAGTCAGCAACCTGTAACTCACATTCTCTTCCCAGTGAATCCCTGGGAGCGCCTGACCCTGGTGGGCTGTT 

CAGCTTCCTGCTGCTGGGGCCAGCGATTTTTGAGGATTTATCTTTAGGCCAGGCTTGCCTCCGTACTTATCCCTGCT 

CTCCCATTTCTCTCTTGTTTGAGAGAGAATGAGGAAGCAAAGAGTGAGAAAGAATAGGGGCTGAAGACGCCACTCCC

AGATGGCTCTTTCTATCCTGCTCTTCTGTTGAAACACACGTGCTGTGGGCCTCAGGCGTTTCTGAAGTGCTCTTTCT 

TGGATTGGACAGGAGATCAGCAGCGTGCACATCTGCTGTGGTCTGAAGTGGTTTGCAGGTCAGCCTCCTCTCCCTAG 

TGTAGAGCAAGCCAGTGTCCTTCGAGGAACCCACCCGGCTGGCCGGGAAGTTTTACAGCAAGGCGCCTGCCTTGGGA 

TAATTCCTTGGTGAAATTCACCTTCCCCCCGCCTCTGTCTGGAGCCCCATCCTGTGTTATCTGTGGTTTTTGGACCC 

CTAATGTCAGCTTGGCTGTAGGACTCCCCGAGGTTTGGTATGTGCTAGAACAATGGGAGGCTGTGATTTGCTGTGTA 

AGCTCACATCCAGCCTTGGAATCTAACGGGCATTCACAACCCGAGTTACCACTTTCCACTCCCTGCTTAGGATTCTG 

TTCCCTGGGCTGAAACTGAAATAAGCTAATTTTTTGGGTCACGGTGGCAGTAGGGGAACCTAGGAGGGTGTGAGTGG 

CATTTGTCAGGGATTTAGCCCATGACGTGTTTCTTGAACCCTACTTTCTGGAAGTGGAGTTGACTCTGGAAGTTTTC 

TAGCAACTGAACAAAAGCTCAGGTTTGTCCTGGTCATGCACATGCCTTAAGCCAGTTCCGTCTTCCCTAGACCTTGG 

CATCCTGTGCTTCTATTTCTTGGAATACGTTCTCCTCTGACCTGCCTGTACCACGTGGGTCCTCTTCAAGTACTGTT 

TTGAAGCTGGGCTCTTTTGTGTAGCTCCCACCCACCTGTAGGGCTAGCTCGGCTTAAGGGAACTCTCCCCATTGGCA 

AACCGGACCCGGCCGCCGCCAGGACTGTGTTTCCAAAGGTTCCCCGCCCCCAACCCCAGCATCAGCCTGTAGCTCCC 

CTGCTGAGGCAGTGTGGTTATGTTCCCAGCAGTGGGGGTCAGACGCCCTTCCTCAGAACTTTCTAGTTGCCCTCTAC 

CTGACTCCTGACTTGTATTCCTTTTAGCAGTAGCCTTCTTCCCTCGGGGAGCCAAAGAGTGTGGTGTGTGGCGCTAT 

ATTGTGGCTGCTATTTCATCTGGTTTCTTTTAATGTGAGGAACTCACATACTGACTTCAGTGGGACTCGGTGAGCCG 

GGGCCGTCTGTGTGGTGGGACCCCCTTTAGCGGGACTCAGTGAGCTGGGGCCGTCTGTGTGGTGGAGCCAGGGCCTC 

TCCCTTTAGTGGAGCCAGGTTGTCGGGCCCCGAATGTCACTGGTGGATCTAAGAAGGGCTGAGTGGTCTGACACCAA 

AACATGCCGCAGGGAGGGCTGTGGTGCCGGTGCTTCCAACAAGGACAGCCCTCCTTGACCCTGAAAGGAACACTGGC 

TTGAAGGACTGCAGACAGGCTCTGAGGGGCACGCCCTCCTCAGCGAGAGGCAGCAAGGTGGCCACAGTGTCACTGGT 

CAGGTGCTTCTCACCACGGGAAAGCCGCCGACCTGTGACTCGCTTGAGATGGGAAAGCGGCGCCACAGACCCCGGGT 

CTCCTTGGCTGTCTGTGGGCCGCCCCTGGCCACCTTGTCCTGGCTCGCAGGGTGCAGGAGCGCCTCGTTCTCTGGGT 

GGCCGGCTTGCTGCTCCGGTTTGGGCTGTCTTACCATAACACCGTCCCAGGGCTCTGCAGGCCACTGTGAGCGCTGG 

CTCCCTGGGCAGTGCTCCTCCGTGTGGACTGTGCCTCAGGCCAGGGCTCACCAGCTGGGGTCCTGTCCGGAAGGATG 

GGATCTTTCTGGGAGCTGCGCCGGACAGAGTGGGGAGCTCCTAGTTTGTGGGGGGAAGCTTTGATATCCATGCCACG 

TCCATCCACCCCACCCCTTTTCGTCACGAGCACAATGGTCTTACATTGGATTTTTGTAAAAAAATAAAAATAAATGG 

AGACTTTAACTCAAGCAGC 

<210> SEQ ID NO 61 
<211> Length : 6,297 
<212> Type : DNA 
<213> ORGANISM : Homo sapiens

<400> sequence : 61 
>M79217_PEA_1_T8

GGCGCCCTTCGGCAAGTTCCGCAGTCGCCTGTCGGAAATGGCTGCCGGCCGGCAGGGGGAGCGGCGGATCAGGCGCG 

GCCTGGAAGGCGGGCGGCCGGCAGCCAGAACGGCTTCTGGGACGCCGACTTTCGCGCAGGCGGCGGCGGCGGCGGCG 

GCGGGTCCCTGAGCTGGAAGCCGGAGAGCAAGCCCTGGAGGTTCACTCTTTCAAGAAGTCGTGTGCTGAGGTGTAAT 

GCTACACAAGTCAGAGGAAGGAAGGGTCCTGAAACACATGGCCTGATTGTTGGCAAAGGCATCATAAGAAGCTGGCA 

TTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATAGACACTATGCATATTTGTTGGTCAGCAAAACCAAGAA 

ACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAGGGTGTTTTTCCTGGGTTTCATCATCAGGTACCTCCTC 

CCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTGATAAAGATTAAGGACATGTTCTTTGGTCAACAGCCAG 

AACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAGCTGCAGCTGAGGAAAATGAAATGTTCATTTTATTTGG 

TGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTGTCAGTGAAACAGAGATCGTTTTGTGGAATAGCTAACC 

CATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAGGCTGCAGAGGACTCATGACAGGCTATACCATGCTGCG 

GAATGGGGGCGCGGGGAACGGAGGTCAGACCTGCATGCTGCGCTGGTCCAACCGCATCCGCCTCACGTGGCTCAGCT 

TCACGCTCTTTGTCATCCTGGTCTTCTTCCCGCTCATCGCCCACTATTACCTCACCACTCTGGATGAGGCTGATGAG 

GCAGGCAAGCGGATTTTTGGTCCCCGGGTGGGGAACGAGCTGTGCGAGGTGAAGCACGTGCTGGATCTGTGCCGCAT 

CCGGGAGTCGGTGAGTGAAGAGCTCCTGCAGCTGGAGGCCAAGCGCCAAGAGCTGAACAGCGAGATCGCCAAGCTGA 

ATCTGAAGATCGAAGCCTGTAAGAAGAGCATTGAGAACGCCAAGCAGGACCTGCTCCAGCTCAAGAATGTCATCAGC 

CAGACCGAGCATTCCTACAAGGAGCTCATGGCCCAGAACCAGCCCAAGCTGTCCCTGCCCATCCGACTGCTCCCAGA 

GAAGGACGATGCCGGCCTCCCTCCCCCGAAGGCCACTCGGGGCTGCCGGCTACACAACTGCTTTGATTATTCTCGTT 

GCCCTCTCACCTCTGGCTTCCCGGTCTACGTCTATGACAGTGACCAGTTTGTCTTTGGCAGCTACCTGGATCCCTTG 

GTCAAGCAGGCTTTTCAGGCGACAGCACGAGCTAACGTTTATGTTACAGAAAATGCAGACATCGCCTGCCTTTACGT 

GATACTAGTGGGAGAGATGCAGGAGCCGGTGGTGCTGCGGCCTGCTGAGCTGGAGAAGCAGTTGTATTCCCTGCCAC 

ACTGGCGGACGGATGGACACAACCATGTCATCATCAATCTGTCACGTAAGTCAGATACACAGAACCTTCTCTATAAC 

GTCAGTACTGGCCGTGCCATGGTGGCCCAGTCCACCTTCTACACTGTCCAGTACAGACCTGGCTTTGACTTGGTCGT

ATCACCGCTGGTCCATGCCATGTCTGAGCCCAACTTCATGGAAATCCCACCACAGGTGCCGGTGAAGCGGAAATATC 

TCTTCACCTTCCAGGGCGAGAAGATTGAGTCTCTGAGGTCTAGCCTTCAGGAGGCCCGCTCCTTCGAAGAGGAAATG 

GAGGGCGACCCTCCCGCCGACTACGATGACCGGATCATTGCCACCCTGAAGGCGGTGCAGGACAGCAAGCTGGATCA 

GGTCCTGGTGGAATTCACCTGCAAAAACCAGCCCAAACCCAGCCTGCCAACTGAGTGGGCACTGTGTGGAGAGCGGG 

AGGACCGCTTGGAATTGCTGAAGCTCTCCACCTTCGCCCTCATCATTACCCCCGGGGACCCTCGCTTGGTTATTTCC 

TCTGGGTGTGCAACACGGCTCTTCGAAGCCCTGGAAGTCGGTGCCGTCCCGGTGGTGCTGGGGGAGCAGGTCCAGCT 

TCCCTACCAGGACATGCTGCAGTGGAACGAGGCGGCCCTGGTGGTGCCAAAGCCTCGTGTTACCGAGGTTCATTTCC 

TGCTCAGAAGCCTCTCCGATAGTGACCTCCTGGCTATGAGGCGGCAAGGCCGCTTTCTCTGGGAGACTTACTTCTCC 

ACTGCTGACAGTATTTTTAATACCGTGCTGGCTATGATTAGGACTCGCATCCAGATCCCAGCCGCTCCCATCCGGGA 

AGAGGCGGCAGCTGAGATCCCCCACCGTTCAGGCAAGGCGGCTGGAACTGACCCCAACATGGCTGACAACGGGGACC 

TGGACCTGGGGCCAGTGGAGACGGAGCCGCCCTACGCCTCACCCAGATACCTCCGCAATTTCACTCTGACTGTCACT 
GACTTTTACCGCAGCTGGAACTGTGCTCCAGGGCCTTTCCATCTTTTCCCCCACACTCCCTTTGACCCTGTGTTGCC
CTCAGAGGCCAAATTCTTGGGCTCAGGGACTGGCTTTCGGCCTATTGGTGGTGGAGCTGGGGGTTCTGGCAAGGAAT 
TTCAGGCAGCGCTTGGAGGCAATGTTCCCCGAGAGCAGTTCACGGTGGTGATGTTGACTTATGAGCGGGAGGAAGTG 
CTTATGAACTCTTTAGAGAGGCTGAATGGCCTCCCTTACCTGAACAAGGTCGTGGTGGTGTGGAATTCTCCCAAGCT 
GCCATCAGAGGACCTTCTGTGGCCTGACATTGGCGTCCCCATCATGGTGGTCCGTACTGAGAAGAACAGTTTGAACA 
ACCGATTCTTACCCTGGAATGAAATTGAGACAGAGGCCATCCTGTCCATTGATGACGATGCTCACCTCCGCCATGAC 
GAAATCATGTTTGGGTTCCGGGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCCTGGCCGTTACCACGCATG 
GGACATCCCCCATCAGTCCTGGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGGTGCTGACAGGTGCTGCCT 
TCTTTCACAAGGCCATCCGGGACATGGTGGATGAATACATCAACTGTGAGGACATTGCCATGAACTTCCTTGTCTCC 
CACATCACTCGGAAGCCCCCCATCAAGGTGACCTCACGGTGGACATTCCGATGCCCAGGATGCCCTCAGGCCCTGTC 
TCATGATGACTCCCACTTCCACGAGCGGCACAAGTGCATCAACTTCTTCGTGAAGGTGTACGGCTACATGCCCCTCC 
TGTACACGCAGTTCAGGGTGGATTCTGTGCTCTTCAAGACACGCCTGCCCCATGACAAGACCAAGTGCTTCAAGTTC 
ATCTAGGGGCAGCGCACGGTCTGGGGAAGAGGATGAGCAGAGGGAGGAAGATGGCTCCCAAGGTTCCTAGGCATTGC 
AGGACCTTGGGCACATCTGCTGGTGGGTGGCCCAGAGCCTCTGCTGGAAGGGGCAGCAGGAGGAGTGGAAGGAAACC 
GCTGCCTTTATCTTGAAGTCAGCCACACTGGGCCTGGAGCCCTGGGCGGAGTCCCCGGGGTTCCCCACACAGGGCAC 
TGACTGATAGCTTACACTGAGGACTGTGGCGACTCTGCAGAGTCACTCACACCGTTCGTACGCCCAGGACAGCTGGT 
TCGTGGTTTTTACATTCAATAACAACTATTATGATTATTTAAAAAGAGAAAGTTTCAGATTTGCCATTCAAGGCTTA 
TTTATATATATGTGTGTGTATATAAATACATGCACACACTTGCATACATATATATTTTTGGCTGGGGGAGTGTGAGT 
TTTGCCTTTCTAAGGGAGGGACCGCGCAGGCTCCTTTGTTCTGTATTCTGGCGGAGATGGGTCCTGGCCTTGTGTCA 
CTGGCTTATCCTTAAAGATCATCTCCCATCCTCCCCAGCGCCATCTGTGTGCAGCAACCAGAAAGGGATGAACTTGG 
CCCTCTTGCGGGCCTGGACAAGGTCTCTTCCTTACCCTTTCTGTTGCCAGTCAGCAACCTGTAACTCACATTCTCTT 
CCCAGTGAATCCCTGGGAGCGCCTGACCCTGGTGGGCTGTTCAGCTTCCTGCTGCTGGGGCCAGCGATTTTTGAGGA 
TTTATCTTTAGGCCAGGCTTGCCTCCGTACTTATCCCTGCTCTCCCATTTCTCTCTTGTTTGAGAGAGAATGAGGAA 
GCAAAGAGTGAGAAAGAATAGGGGCTGAAGACGCCACTCCCAGATGGCTCTTTCTATCCTGCTCTTCTGTTGAAACA 
CACGTGCTGTGGGCCTCAGGCGTTTCTGAAGTGCTCTTTCTTGGATTGGACAGGAGATCAGCAGCGTGCACATCTGC 
TGTGGTCTGAAGTGGTTTGCAGGTCAGCCTCCTCTCCCTAGTGTAGAGCAAGCCAGTGTCCTTCGAGGAACCCACCC 
GGCTGGCCGGGAAGTTTTACAGCAAGGCGCCTGCCTTGGGATAATTCCTTGGTGAAATTCACCTTCCCCCCGCCTCT 
GTCTGGAGCCCCATCCTGTGTTATCTGTGGTTTTTGGACCCCTAATGTCAGCTTGGCTGTAGGACTCCCCGAGGTTT 
GGTATGTGCTAGAACAATGGGAGGCTGTGATTTGCTGTGTAAGCTCACATCCAGCCTTGGAATCTAACGGGCATTCA 
CAACCCGAGTTACCACTTTCCACTCCCTGCTTAGGATTCTGTTCCCTGGGCTGAAACTGAAATAAGCTAATTTTTTG 
GGTCACGGTGGCAGTAGGGGAACCTAGGAGGGTGTGAGTGGCATTTGTCAGGGATTTAGCCCATGACGTGTTTCTTG 
AACCCTACTTTCTGGAAGTGGAGTTGACTCTGGAAGTTTTCTAGCAACTGAACAAAAGCTCAGGTTTGTCCTGGTCA 
TGCACATGCCTTAAGCCAGTTCCGTCTTCCCTAGACCTTGGCATCCTGTGCTTCTATTTCTTGGAATACGTTCTCCT 
CTGACCTGCCTGTACCACGTGGGTCCTCTTCAAGTACTGTTTTGAAGCTGGGCTCTTTTGTGTAGCTCCCACCCACC 
TGTAGGGCTAGCTCGGCTTAAGGGAACTCTCCCCATTGGCAAACCGGACCCGGCCGCCGCCAGGACTGTGTTTCCAA 
AGGTTCCCCGCCCCCAACCCCAGCATCAGCCTGTAGCTCCCCTGCTGAGGCAGTGTGGTTATGTTCCCAGCAGTGGG 
GGTCAGACGCCCTTCCTCAGAACTTTCTAGTTGCCCTCTACCTGACTCCTGACTTGTATTCCTTTTAGCAGTAGCCT 
TCTTCCCTCGGGGAGCCAAAGAGTGTGGTGTGTGGCGCTATATTGTGGCTGCTATTTCATCTGGTTTCTTTTAATGT 
GAGGAACTCACATACTGACTTCAGTGGGACTCGGTGAGCCGGGGCCGTCTGTGTGGTGGGACCCCCTTTAGCGGGAC
TCAGTGAGCTGGGGCCGTCTGTGTGGTGGAGCCAGGGCCTCTCCCTTTAGTGGAGCCAGGTTGTCGGGCCCCGAATG 
TCACTGGTGGATCTAAGAAGGGCTGAGTGGTCTGACACCAAAACATGCCGCAGGGAGGGCTGTGGTGCCGGTGCTTC 
CAACAAGGACAGCCCTCCTTGACCCTGAAAGGAACACTGGCTTGAAGGACTGCAGACAGGCTCTGAGGGGCACGCCC 
TCCTCAGCGAGAGGCAGCAAGGTGGCCACAGTGTCACTGGTCAGGTGCTTCTCACCACGGGAAAGCCGCCGACCTGT 
GACTCGCTTGAGATGGGAAAGCGGCGCCACAGACCCCGGGTCTCCTTGGCTGTCTGTGGGCCGCCCCTGGCCACCTT 
GTCCTGGCTCGCAGGGTGCAGGAGCGCCTCGTTCTCTGGGTGGCCGGCTTGCTGCTCCGGTTTGGGCTGTCTTACCA 
TAACACCGTCCCAGGGCTCTGCAGGCCACTGTGAGCGCTGGCTCCCTGGGCAGTGCTCCTCCGTGTGGACTGTGCCT 
CAGGCCAGGGCTCACCAGCTGGGGTCCTGTCCGGAAGGATGGGATCTTTCTGGGAGCTGCGCCGGACAGAGTGGGGA 
GCTCCTAGTTTGTGGGGGGAAGCTTTGATATCCATGCCACGTCCATCCACCCCACCCCTTTTCGTCACGAGCACAAT 
GGTCTTACATTGGATTTTTGTAAAAAAATAAAAATAAATGGAGACTTTAACTCAAGCAGC 

<210> SEQ ID NO 62 

<211> Length : 3,466 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 62 
>M79217_PEA_1_T10

TCCTGAGCTCAGGCAACCTGCCCGCCTCGGCCTCCCAGAGTGCTGGGATTACAGGCATGAGCCACGGTGCCCAGCCC 
AGATGGGTTCTCACTTTATTGTTCAAGCTGGTCTGAAACTCCTGGCCTCGAGCAAGCCTCCCAAGTGCTGGGATTAC 
AGGGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCCTGGCCGTTACCACGCATGGGACATCCCCCATCAGTC 
CTGGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGGTGCTGACAGGTGCTGCCTTCTTTCACAAGTATTATG 
CCTACCTGTATTCTTATGTGATGCCCCAGGCCATCCGGGACATGGTGGATGAATACATCAACTGTGAGGACATTGCC 
ATGAACTTCCTTGTCTCCCACATCACTCGGAAGCCCCCCATCAAGGTGACCTCACGGTGGACATTCCGATGCCCAGG 
ATGCCCTCAGGCCCTGTCTCATGATGACTCCCACTTCCACGAGCGGCACAAGTGCATCAACTTCTTCGTGAAGGTGT 
ACGGCTACATGCCCCTCCTGTACACGCAGTTCAGGGTGGATTCTGTGCTCTTCAAGACACGCCTGCCCCATGACAAG 
ACCAAGTGCTTCAAGTTCATCTAGGGGCAGCGCACGGTCTGGGGAAGAGGATGAGCAGAGGGAGGAAGATGGCTCCC 
AAGGTTCCTAGGCATTGCAGGACCTTGGGCACATCTGCTGGTGGGTGGCCCAGAGCCTCTGCTGGAAGGGGCAGCAG 
GAGGAGTGGAAGGAAACCGCTGCCTTTATCTTGAAGTCAGCCACACTGGGCCTGGAGCCCTGGGCGGAGTCCCCGGG 
GTTCCCCACACAGGGCACTGACTGATAGCTTACACTGAGGACTGTGGCGACTCTGCAGAGTCACTCACACCGTTCGT 
ACGCCCAGGACAGCTGGTTCGTGGTTTTTACATTCAATAACAACTATTATGATTATTTAAAAAGAGAAAGTTTCAGA 
TTTGCCATTCAAGGCTTATTTATATATATGTGTGTGTATATAAATACATGCACACACTTGCATACATATATATTTTT 
GGCTGGGGGAGTGTGAGTTTTGCCTTTCTAAGGGAGGGACCGCGCAGGCTCCTTTGTTCTGTATTCTGGCGGAGATG 
GGTCCTGGCCTTGTGTCACTGGCTTATCCTTAAAGATCATCTCCCATCCTCCCCAGCGCCATCTGTGTGCAGCAACC 
AGAAAGGGATGAACTTGGCCCTCTTGCGGGCCTGGACAAGGTCTCTTCCTTACCCTTTCTGTTGCCAGTCAGCAACC
TGTAACTCACATTCTCTTCCCAGTGAATCCCTGGGAGCGCCTGACCCTGGTGGGCTGTTCAGCTTCCTGCTGCTGGG
GCCAGCGATTTTTGAGGATTTATCTTTAGGCCAGGCTTGCCTCCGTACTTATCCCTGCTCTCCCATTTCTCTCTTGT 
TTGAGAGAGAATGAGGAAGCAAAGAGTGAGAAAGAATAGGGGCTGAAGACGCCACTCCCAGATGGCTCTTTCTATCC 
TGCTCTTCTGTTGAAACACACGTGCTGTGGGCCTCAGGCGTTTCTGAAGTGCTCTTTCTTGGATTGGACAGGAGATC 
AGCAGCGTGCACATCTGCTGTGGTCTGAAGTGGTTTGCAGGTCAGCCTCCTCTCCCTAGTGTAGAGCAAGCCAGTGT 
CCTTCGAGGAACCCACCCGGCTGGCCGGGAAGTTTTACAGCAAGGCGCCTGCCTTGGGATAATTCCTTGGTGAAATT 
CACCTTCCCCCCGCCTCTGTCTGGAGCCCCATCCTGTGTTATCTGTGGTTTTTGGACCCCTAATGTCAGCTTGGCTG 
TAGGACTCCCCGAGGTTTGGTATGTGCTAGAACAATGGGAGGCTGTGATTTGCTGTGTAAGCTCACATCCAGCCTTG 
GAATCTAACGGGCATTCACAACCCGAGTTACCACTTTCCACTCCCTGCTTAGGATTCTGTTCCCTGGGCTGAAACTG 
AAATAAGCTAATTTTTTGGGTCACGGTGGCAGTAGGGGAACCTAGGAGGGTGTGAGTGGCATTTGTCAGGGATTTAG 
CCCATGACGTGTTTCTTGAACCCTACTTTCTGGAAGTGGAGTTGACTCTGGAAGTTTTCTAGCAACTGAACAAAAGC 
TCAGGTTTGTCCTGGTCATGCACATGCCTTAAGCCAGTTCCGTCTTCCCTAGACCTTGGCATCCTGTGCTTCTATTT 
CTTGGAATACGTTCTCCTCTGACCTGCCTGTACCACGTGGGTCCTCTTCAAGTACTGTTTTGAAGCTGGGCTCTTTT 
GTGTAGCTCCCACCCACCTGTAGGGCTAGCTCGGCTTAAGGGAACTCTCCCCATTGGCAAACCGGACCCGGCCGCCG 
CCAGGACTGTGTTTCCAAAGGTTCCCCGCCCCCAACCCCAGCATCAGCCTGTAGCTCCCCTGCTGAGGCAGTGTGGT 
TATGTTCCCAGCAGTGGGGGTCAGACGCCCTTCCTCAGAACTTTCTAGTTGCCCTCTACCTGACTCCTGACTTGTAT 
TCCTTTTAGCAGTAGCCTTCTTCCCTCGGGGAGCCAAAGAGTGTGGTGTGTGGCGCTATATTGTGGCTGCTATTTCA 
TCTGGTTTCTTTTAATGTGAGGAACTCACATACTGACTTCAGTGGGACTCGGTGAGCCGGGGCCGTCTGTGTGGTGG 
GACCCCCTTTAGCGGGACTCAGTGAGCTGGGGCCGTCTGTGTGGTGGAGCCAGGGCCTCTCCCTTTAGTGGAGCCAG 
GTTGTCGGGCCCCGAATGTCACTGGTGGATCTAAGAAGGGCTGAGTGGTCTGACACCAAAACATGCCGCAGGGAGGG 
CTGTGGTGCCGGTGCTTCCAACAAGGACAGCCCTCCTTGACCCTGAAAGGAACACTGGCTTGAAGGACTGCAGACAG 
GCTCTGAGGGGCACGCCCTCCTCAGCGAGAGGCAGCAAGGTGGCCACAGTGTCACTGGTCAGGTGCTTCTCACCACG 
GGAAAGCCGCCGACCTGTGACTCGCTTGAGATGGGAAAGCGGCGCCACAGACCCCGGGTCTCCTTGGCTGTCTGTGG 
GCCGCCCCTGGCCACCTTGTCCTGGCTCGCAGGGTGCAGGAGCGCCTCGTTCTCTGGGTGGCCGGCTTGCTGCTCCG 
GTTTGGGCTGTCTTACCATAACACCGTCCCAGGGCTCTGCAGGCCACTGTGAGCGCTGGCTCCCTGGGCAGTGCTCC 
TCCGTGTGGACTGTGCCTCAGGCCAGGGCTCACCAGCTGGGGTCCTGTCCGGAAGGATGGGATCTTTCTGGGAGCTG 
CGCCGGACAGAGTGGGGAGCTCCTAGTTTGTGGGGGGAAGCTTTGATATCCATGCCACGTCCATCCACCCCACCCCT 
TTTCGTCACGAGCACAATGGTCTTACATTGGATTTTTGTAAAAAAATAAAAATAAATGGAGACTTTAACTCAAGCAG 
C 

<210> SEQ ID NO 63 

<211> Length : 3,580 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 63 
>M79217_PEA_1_T15

GGCGCCCTTCGGCAAGTTCCGCAGTCGCCTGTCGGAAATGGCTGCCGGCCGGCAGGGGGAGCGGCGGATCAGGCGCG 
GCCTGGAAGGCGGGCGGCCGGCAGCCAGAACGGCTTCTGGGACGCCGACTTTCGCGCAGGCGGCGGCGGCGGCGGCG 
GCGGGTCCCTGAGCTGGAAGCCGGAGAGCAAGCCCTGGAGGTTCACTCTTTCAAGAAGTCGTGTGCTGAGGTGTAAT 
GCTACACAAGTCAGAGGAAGGAAGGGTCCTGAAACACATGGCCTGATTGTTGGCAAAGGCATCATAAGAAGCTGGCA 
TTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATAGACACTATGCATATTTGTTGGTCAGCAAAACCAAGAA 
ACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAGGGTGTTTTTCCTGGGTTTCATCATCAGGTACCTCCTC 
CCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTGATAAAGATTAAGGACATGTTCTTTGGTCAACAGCCAG 
AACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAGCTGCAGCTGAGGAAAATGAAATGTTCATTTTATTTGG 
TGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTGTCAGTGAAACAGAGATCGTTTTGTGGAATAGCTAACC 
CATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAGGCTGCAGAGGACTCATGACAGGCTATACCATGCTGCG 
GAATGGGGGCGCGGGGAACGGAGGTCAGACCTGCATGCTGCGCTGGTCCAACCGCATCCGCCTCACGTGGCTCAGCT 
TCACGCTCTTTGTCATCCTGGTCTTCTTCCCGCTCATCGCCCACTATTACCTCACCACTCTGGATGAGGCTGATGAG 
GCAGGCAAGCGGATTTTTGGTCCCCGGGTGGGGAACGAGCTGTGCGAGGTGAAGCACGTGCTGGATCTGTGCCGCAT 
CCGGGAGTCGGTGAGTGAAGAGCTCCTGCAGCTGGAGGCCAAGCGCCAAGAGCTGAACAGCGAGATCGCCAAGCTGA 
ATCTGAAGATCGAAGCCTGTAAGAAGAGCATTGAGAACGCCAAGCAGGACCTGCTCCAGCTCAAGAATGTCATCAGC 
CAGACCGAGCATTCCTACAAGGAGCTCATGGCCCAGAACCAGCCCAAGCTGTCCCTGCCCATCCGACTGCTCCCAGA 
GAAGGACGATGCCGGCCTCCCTCCCCCGAAGGCGACTCGGGGCTGCCGGCTACACAACTGCTTTGATTATTCTCGTT 
GCCCTCTCACCTCTGGCTTCCCGGTCTACGTCTATGACAGTGACCAGTTTGTCTTTGGCAGCTACCTGGATCCCTTG 
GTCAAGCAGGCTTTTCAGGCGACAGCACGAGCTAACGTTTATGTTACAGAAAATGCAGACATCGCCTGCCTTTACGT 
GATACTAGTGGGAGAGATGCAGGAGCCGGTGGTGCTGCGGCCTGCTGAGCTGGAGAAGCAGTTGTATTCCCTGCCAC 
ACTGGCGGACGGATGGACACAACCATGTCATCATCAATCTGTCACGTAAGTCAGATACACAGAACCTTCTCTATAAC 
GTCAGTACTGGCCGTGCCATGGTGGCCCAGTCCACCTTCTACACTGTCCAGTACAGACCTGGCTTTGACTTGGTCGT 
ATCACCGCTGGTCCATGCCATGTCTGAGCCCAACTTCATGGAAATCCCACCACAGGTGCCGGTGAAGCGGAAATATC 
TCTTCACCTTCCAGGGCGAGAAGATTGAGTCTCTGAGGTCTAGCCTTCAGGAGGCCCGCTCCTTCGAAGAGGAAATG 
GAGGGCGACCCTCCCGCCGACTACGATGACCGGATCATTGCCACCCTGAAGGCGGTGCAGGACAGCAAGCTGGATCA 
GGTCCTGGTGGAATTCACCTGCAAAAACCAGCCCAAACCCAGCCTGCCAACTGAGTGGGCACTGTGTGGAGAGCGGG 
AGGACCGCTTGGAATTGCTGAAGCTCTCCACCTTCGCCCTCATCATTACCCCCGGGGACCCTCGCTTGGTTATTTCC 
TCTGGGTGTGCAACACGGCTCTTCGAAGCCCTGGAAGTCGGTGCCGTCCCGGTGGTGCTGGGGGAGCAGGTCCAGCT 
TCCCTACCAGGACATGCTGCAGTGGAACGAGGCGGCCCTGGTGGTGCCAAAGCCTCGTGTTACCGAGGTTCATTTCC 
TGCTCAGAAGCCTCTCCGATAGTGACCTCCTGGCTATGAGGCGGCAAGGCCGCTTTCTCTGGGAGACTTACTTCTCC 
ACTGCTGACAGTATTTTTAATACCGTGCTGGCTATGATTAGGACTCGCATCCAGATCCCAGCCGCTCCCATCCGGGA 
AGAGGCGGCAGCTGAGATCCCCCACCGTTCAGGCAAGGCGGCTGGAACTGACCCCAACATGGCTGACAACGGGGACC 
TGGACCTGGGGCCAGTGGAGACGGAGCCGCCCTACGCCTCACCCAGATACCTCCGCAATTTCACTCTGACTGTCACT 
GACTTTTACCGCAGCTGGAACTGTGCTCCAGGGCCTTTCCATCTTTTCCCCCACACTCCCTTTGACCCTGTGTTGCC 
CTCAGAGGCCAAATTCTTGGGCTCAGGGACTGGCTTTCGGCCTATTGGTGGTGGAGCTGGGGGTTCTGGCAAGGAAT 
TTCAGGCAGCGCTTGGAGGCAATGTTCCCCGAGAGCAGTTCACGGTGGTGATGTTGACTTATGAGCGGGAGGAAGTG 
CTTATGAACTCTTTAGAGAGGCTGAATGGCCTCCCTTACCTGAACAAGGTCGTGGTGGTGTGGAATTCTCCCAAGCT 
GCCATCAGAGGACCTTCTGTGGCCTGACATTGGCGTCCCCATCATGGTGGTCCGTACTGAGAAGAACAGTTTGAACA
ACCGATTCTTACCCTGGAATGAAATTGAGACAGAGGCCATCCTGTCCATTGATGACGATGCTCACCTCCGCCATGAC 
GAAATCATGTTTGGGTTCCGGGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCCTGGCCGTTACCACGCATG 
GGACATCCCCCATCAGTCCTGGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGGTGCTGACAGGTGCTGCCT 
TCTTTCACAAGGTAAGAAAAAGCTGGTAATAATGGCATCGACTTGGTGAGAGTTTCACCTTTGTGTGGTAGCGGAAT 
GCTGCCCTCAGCTTAGCTCTCCTAACGCTTCTTACATGTTTCTTTTGTGCTAGAAGTCAGTTTTTTCTATTTTTACA 
GACAATGATCAAGATGCTTAGAGCAACTCTGGGATAAAAAGTCAAGATGAGAGGGCTGCCTGTACAGTTGCACATAG 
GCCATTTGGAAACCACTTTATCTTTCTGGGCGTTGGCTCTCCGTTTGTAAAACTGAGGGCACTGGGCTAAAGACACC 
TCAAATACCTTCCAGTTTTAACACTGCCACCCTAGATATGGCCCAGCCATCAGAAGGTGACCTGGGCACTTTTCTGA 
CTTAGATATACCATGCCTGTCCCGGGCCCCACGATGAG 

<210> SEQ ID NO 64 

<211> Length : 1,786

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 64 
>M79217_PEA_1_T18

GGCGCCCTTCGGCAAGTTCCGCAGTCGCCTGTCGGAAATGGCTGCCGGCCGGCAGGGGGAGCGGCGGATCAGGCGCG 
GCCTGGAAGGCGGGCGGCCGGCAGCCAGAACGGCTTCTGGGACGCCGACTTTCGCGCAGGCGGCGGCGGCGGCGGCG 
GCGGGTCCCTGAGCTGGAAGCCGGAGAGCAAGCCCTGGAGGTTCACTCTTTCAAGAAGTCGTGTGCTGAGGTGTAAT 
GCTACACAAGTCAGAGGAAGGAAGGGTCCTGAAACACATGGCCTGATTGTTGGCAAAGGCATCATAAGAAGCTGGCA 
TTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATAGACACTATGCATATTTGTTGGTCAGCAAAACCAAGAA 
ACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAGGGTGTTTTTCCTGGGTTTCATCATCAGGTACCTCCTC 
CCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTGATAAAGATTAAGGACATGTTCTTTGGTCAACAGCCAG 
AACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAGCTGCAGCTGAGGAAAATGAAATGTTCATTTTATTTGG 
TGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTGTCAGTGAAACAGAGATCGTTTTGTGGAATAGCTAACC 
CATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAGGCTGCAGAGGACTCATGACAGCCTGTAGCTCCCCTGC 
TGAGGCAGTGTGGTTATGTTCCCAGCAGTGGGGGTCAGACGCCCTTCCTCAGAACTTTCTAGTTGCCCTCTACCTGA 
CTCCTGACTTGTATTCCTTTTAGCAGTAGCCTTCTTCCCTCGGGGAGCCAAAGAGTGTGGTGTGTGGCGCTATATTG 
TGGCTGCTATTTCATCTGGTTTCTTTTAATGTGAGGAACTCACATACTGACTTCAGTGGGACTCGGTGAGCCGGGGC 
CGTCTGTGTGGTGGGACCCCCTTTAGCGGGACTCAGTGAGCTGGGGCCGTCTGTGTGGTGGAGCCAGGGCCTCTCCC 
TTTAGTGGAGCCAGGTTGTCGGGCCCCGAATGTCACTGGTGGATCTAAGAAGGGCTGAGTGGTCTGACACCAAAACA 
TGCCGCAGGGAGGGCTGTGGTGCCGGTGCTTCCAACAAGGACAGCCCTCCTTGACCCTGAAAGGAACACTGGCTTGA 
AGGACTGCAGACAGGCTCTGAGGGGCACGCCCTCCTCAGCGAGAGGCAGCAAGGTGGCCACAGTGTCACTGGTCAGG 
TGCTTCTCACCACGGGAAAGCCGCCGACCTGTGACTCGCTTGAGATGGGAAAGCGGCGCCACAGACCCCGGGTCTCC 
TTGGCTGTCTGTGGGCCGCCCCTGGCCACCTTGTCCTGGCTCGCAGGGTGCAGGAGCGCCTCGTTCTCTGGGTGGCC
GGCTTGCTGCTCCGGTTTGGGCTGTCTTACCATAACACCGTCCCAGGGCTCTGCAGGCCACTGTGAGCGCTGGCTCC 
CTGGGCAGTGCTCCTCCGTGTGGACTGTGCCTCAGGCCAGGGCTCACCAGCTGGGGTCCTGTCCGGAAGGATGGGAT 
CTTTCTGGGAGCTGCGCCGGACAGAGTGGGGAGCTCCTAGTTTGTGGGGGGAAGCTTTGATATCCATGCCACGTCCA 
TCCACCCCACCCCTTTTCGTCACGAGCACAATGGTCTTACATTGGATTTTTGTAAAAAAATAAAAATAAATGGAGAC 
TTTAACTCAAGCAGC 

<210> SEQ ID NO 65 

<211> Length : 7,128

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 65 
>M62096_PEA_1_T4

CGGCAGAGCGCGCTGGTGCTGATGCAGGATGGCTGAGCGCGCAGGAGCCCGGGAGGTCTGAGCCGGGCGAGGCTCGC 
TCCCTGCGCATCGCCTCCTCCGCCCGCCGCGTGGTCGCGGGCAGGTGGGCCGGGGGGCGCTGGGCAGGGGCGGGGCA 
GGGCCAGGGCAGGCCGGTCTGCAGCCGGAGGGGCCGGAGCGGAGAAGCTGCCCACCTTCCCGGGCTCGGAGCGGCCG 
GGGCTGCTCAGCCGGCCGGGCTCGCGATGACCTGCTGAGAAGCGTCGTCGGAGGCTGCAGGAGGCGGCCTAGCTGTG 
GGCGGTGCAGCTCGCGGCCTCCTCCCTCGTCGTTCCCGGCCCCGGCCCCCCACCCATCCCCGTGCCCCCTCCCTACC 
GCCGGCCGAGATGGCGGATCCAGCCGAATGCAGCATCAAAGTGATGTGCCGGTTCCGGCCCCTCAACGAAGCGGAGA 
TCCTCCGCGGGGACAAATTCATCCCCAAATTTAAAGGCGATGAGACCGTGGTGATCGGGCAAGGGAAGCCATATGTC 
TTCGACAGAGTGCTACCTCCCAACACGACCCAAGAGCAGGTTTACAATGCATGTGCGAAGCAAATTGTCAAAGATGT 
CCTTGAAGGTTATAACGGGACGATTTTTGCGTATGGGCAGACTTCATCAGGAAAAACCCACACCATGGAGGGGAAGC 
TGCATGACCCCCAGCTCATGGGGATCATCCCACGAATTGCCCATGATATCTTTGACCATATCTACTCCATGGATGAG 
AACCTGGAGTTTCACATAAAGGTTTCCTATTTTGAGATCTACTTGGACAAAATAAGGGACTTACTTGATGTATCCAA 
GACCAACTTGGCTGTTCATGAAGATAAAAACAGAGTCCCGTATGTAAAGGGGTGCACTGAGCGGTTTGTGTCGAGCC 
CTGAGGAAGTCATGGATGTAATAGATGAAGGCAAAGCAAACCGACACGTGGCTGTGACAAACATGAATGAACACAGC 
TCTAGAAGTCACAGTATCTTCCTGATAAATATTAAACAAGAGAATGTAGAGACTGAAAAAAAACTCAGTGGGAAACT 
TTATTTGGTTGATTTGGCTGGGAGCGAAAAGGTCAGCAAAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAA 
ATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAAAAACACATGTGCCATAC 
CGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACCATCGTCATTTGCTGTTC 
TCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGAGCTAAGACCATCAAGAATACAG 
TCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTTG 
AAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATGATTTTCTAGCAGCACACGTGTTTGGAAA 
GCTACTAGAATAATTGAATAATTCAGCACCTGAGGCTGGTGGATGATTCTTTGCAATTTGGCAGGAATGGGAGAGTC 
GGGAGCAGTAGTTGGCAAGGTGGGGAGTAGCCATATGAAGTTTTATTTCGGGAATCCTCCAGGAGAAGCTGTGCCTG
AGGATGAACAGATCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATCATAGACAATATTGCT 
CCTGTTGTTGCTGGCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCTCTACAGACAACTGGA 
TGACAAGGATGATGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGATGAGC 
TTTTAGCTTCCACAAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACACGTCTCCAGATTGAAAATGAGGCAGCC 
AAGGATGAGGTGAAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAATTATGACCAGAAATCACAGGAAGTGGA 
GGATAAGACCCGGGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAACGACTACATTGACAACCACACAGAGAG 
AGCTGAGCCAGCTACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTGAGATCCTGAATTTGCTGTTGAAAGAT 
CTGGGGGAGATAGGTGGAATTATTGGCACCAATGATGTGAAAACTTTGGCAGATGTGAATGGAGTCATTGAGGAGGA 
GTTTACCATGGCCCGCCTGTACATCAGCAAGATGAAGTCAGAGGTCAAGTCCCTGGTGAACCGCAGCAAACAGCTCG 
AGAGCGCCCAGATGGACTCCAACAGGAAGATGAATGCCAGCGAGCGGGAGCTGGCAGCCTGCCAGCTGCTCATCTCC 
CAGCACGAAGCCAAGATCAAGTCTCTGACAGACTACATGCAGAACATGGAACAGAAGAGGAGGCAGCTAGAAGAGTC 
CCAGGACTCGCTCAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAAAAATGCACGAAGTCAGCTTCCAGGATAAGG 
AGAAGGAACATCTGACGCGGTTGCAGGATGCTGAAGAAATGAAGAAGGCGCTGGAGCAGCAGATGGAGAGCCACCGG 
GAAGCTCACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAGCAGAAAATCATTGATGAGATTCGGGA 
TTTGAATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAAGAGA 
GAGAAATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCTGGAG 
GAGACAGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAGTTAA 
AAAAAGTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAAGCAGAAAATTTCCTTCTTGGAGAATA 
ACCTGGAGCAGCTCACCAAAGTTCACAAGCAGCTGGTCCGGGACAACGCAGACCTGCGCTGTGAACTGCCCAAGCTG 
GAGAAGCGGCTGCGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAGCGCGCTGAAGGAGGCCAAGGAGAACGCCAT 
GCGGGACCGTAAGCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGGCCGTGCGGGCCAAGAACATGGCCAGAAGGG 
CCCATTCAGCCCAGATCGCCAAGCCCATCCGCCCCGGACACTACCCGGCCTCATCTCCAACGGCCGTCCATGCCATT 
CGAGGGGGAGGAGGCAGCTCTTCAAATTCCACTCACTACCAGAAATAAATACAAAATATGACTCCACGTAGCATGTC 
AAGGACTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCCATTTTTTTTTTATACTTGCTTACTCC 
AGCCATCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCTGTGTGTACAGATGTGTGCTCGGACTT 
TTCTCTTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTACTGAGAACCATTACCACCGACTCGGC 
CTCCGGGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCAGTGCACTGTCCCATCTGTACGCCCTA 
ATGTGCCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATCATGGTCTGCCTCAACTGTCTGGTTTC 
CTGTAAAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCTATAGGGTTCAAAGTTTTCCTTCTGAA 
CACTTTTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCACATCTTTGGATCTGTAAAATATACCT 
TTTAGTATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAAACAATCTGACAGTAGCAGTGTAGAAT 
TTGTTCATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCTCCGAGCTGCTTGCTTTTGAACCTGCA 
GCAACTAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTGAAAATGCTGCAAAGTTGGATAAGTGG 
AAATGTGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCTTGCATTGGGAATGGCTGCTTTCTCTA 
ACCATTTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAGCAGTTGAAAAAGGAAGCCTTCGGGAG 
AAAGTGCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCCATGAATAAGTTATTATTTTAACCCAT 
AATTGGCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCATCTTCTTAGGCACAATTAGAAAAAATC 
CACATAGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACATGAATCTAAAGCTGAATCAACCCTTACTTC

CAGTTGTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGAAACTGACTGTGCTGCTATTTGTTTTG 

ACAAATTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTCCTATCCTATCATGTTGTGTGCCCAAG 

ATGAGTGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATATATGCCAGGGGAGACTTTGGGCTTTTC 

TCATGACTGTGTGGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTG 

TGTAAAGTGCTAAGAACTGTGCATTGACATCCAAACATTTCTTGTACAAAATTTCCCTAGCAAAGCAAACCTGCTTT 

GACTTAATTTATTTGTTAAATGTTGCACTTTGTTTATGTATGTTTTGTTTTTGGTGGGGAATAAGGAGAGAGAGGAC

GACAAATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATTTTGCATTTGTTTAAATTTTTTTCATTCTTTAATTT 

TGTTATCAGTGCCAGCCCAATATACCTGCTCTACCATTATTTGCGGTCTGATAAAAGGGTCCTTGTGGGGCAGGTTT 

TGCAAAGCTTATCAGGTAATAACATATGCCACATAACCTTGTTGATATGTTTGCTTCTGATTTGGGAAGCTAAACAT 

TGGTGTTTGAGAGGATTGCCAATTATTAATTGTCATTACCACTACTCTCCATTACTTTTTGTTTGGAAATTGAACAA 

AGGTCAGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAAAATAGATTGTTTTAGATTCTTTCCAGGGTGATTTT 

TCCCTGGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTCATGTTTCAAAGGGAAACATTCGCTTGTAGT 

TCCATTTTACTTGATCTCTACAAGGGACTGACAACATTTGCTTTACTTTTATTCACAGAGAAAGTTGGCTTTGATGT 

CTCTTAAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACCTAGCTTCAATCTTTATAGGACTTCTAATCT 

AATTTTCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGAACGGATTATTCAAATGGATCCTTAAATATTGCTAT 

GTATAATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTAGGCCACTTTCTAAAAAAGCCACATATGTGCAATTT 

TCAGGTTTTTAGACTATTGCTCCCTGTACTTTAAATGTAAAAACCACACTTCTGAACAACTAAGCTCATGAATATGA 

TTTTGGTTATATGCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCTCCTCTATGAATAATTTTATATTTCATGCTA 

CTTCTTGAAAGTTTACTCTTTGATGCTCTAAGAGAACAGCCAGATGGTTTATATGAATAATCTTTATCTGCAGGATG 

GTGGATTGGTAAATTAGGAGAATGTTGTTTGAGATATCAAGATTTATGTCTGGGAACTAAAATATATAATGCCAAAT 

GTGTTTTTGTCAATTACTAGAGAATTCTGTGCAAACATATCATCTCTTCAAATGCTGCACACTTTGCTTTTGTTAAA 

CAGCAGGTAGTAGACAGAACAATAACAGTTTCGCGTTAAGACTTTTAAAGGAAATAGAATCGTGATTAAGAAATCAG 

AATTTATAGATATATTGGGATAAATGAAGAAATAAAAATGTTTGTCTAGAATGTAGCATCTAGTGACTTTTTAAAGC 

CCTAACGTTTACATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAGCAAATAAAAGTTCTTAACAATCCCCTCTTT 

CGAAGTGCATTTTTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGCTATATGACATACTGGGAAAGGCAGAGGGTG 

GAGGGAAGATTTCACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCTAAAGAAACCATCATAATCTAAAATTGCTTC 

ATTTAACACTAACAATTTAGACTTTTTAAACCAAGCATTGAATAATGGCTGGATAACTGCCGAAGTAAGCGCCGCTC 

CATGAAGTCTGCTTACTTATTTAAAAATTGTGTATCAGTTTTAAATACTGTTCATTGTGTGCAGATATAAGGGGAAT 

AGGGCATTCTGTAGAATTATACATGTCTAGTTTGTAAAGTGTGTCCTGTGTACTGCAGATGTGTGTTCTCTGGGCTT 

TATGTATCTGTACAGTAGCTTTCACATTAAAAAAATTGTGGACAAACTTGTCCGGGGGGTTTGAGGGGAGAATGGTG 

GTTTATATCAATAACGATGCTGTACTATAGTCCATGTAACAAAAGATCTGGAAGTCACCCTCCTCTGGCCCACGGAA 

AATTTTGGTAATCTTCTAGGTTCTAAAATGAAGATGTATGGGTACTCTGGCAGACTGCATGTTGTATAATTTGAAAA 

ATACTAAAAGTGGAAAATAAAATTGAATTAAACTTTGGCTGGTC 



<210> SEQ ID NO 66 
<211> Length : 7,004
<212> Type : DNA

<213> ORGANISM : Homo sapiens 



<400> sequence : 66 
>M62096_PEA_1_T5

CGGCAGAGCGCGCTGGTGCTGATGCAGGATGGCTGAGCGCGCAGGAGCCCGGGAGGTCTGAGCCGGGCGAGGCTCGC 
TCCCTGCGCATCGCCTCCTCCGCCCGCCGCGTGGTCGCGGGCAGGTGGGCCGGGGGGCGCTGGGCAGGGGCGGGGCA 
GGGCCAGGGCAGGCCGGTCTGCAGCCGGAGGGGCCGGAGCGGAGAAGCTGCCCACCTTCCCGGGCTCGGAGCGGCCG 
GGGCTGCTCAGCCGGCCGGGCTCGCGATGACCTGCTGAGAAGCGTCGTCGGAGGCTGCAGGAGGCGGCCTAGCTGTG 
GGCGGTGCAGCTCGCGGCCTCCTCCCTCGTCGTTCCCGGCCCCGGCCCCCCTCCCTACCGCCGGCCGAGATGGCGGA 
TCCAGCCGAATGCAGCATCAAAGTGATGTGCCGGTTCCGGCCCCTCAACGAAGCGGAGATCCTCCGCGGGGACAAAT 
TCATCCCCAAATTTAAAGGCGATGAGACCGTGGTGATCGGGCAAGGGAAGCCATATGTCTTCGACAGAGTGCTACCT 
CCCAACACGACCCAAGAGCAGGTTTACAATGCATGTGCGAAGCAAATTGTCAAAGATGTCCTTGAAGGTTATAACGG 
GACGATTTTTGCGTATGGGCAGACTTCATCAGGAAAAACCCACACCATGGAGGGGAAGCTGCATGACCCCCAGCTCA 
TGGGGATCATCCCACGAATTGCCCATGATATCTTTGACCATATCTACTCCATGGATGAGAACCTGGAGTTTCACATA 
AAGGTTTCCTATTTTGAGATCTACTTGGACAAAATAAGGGACTTACTTGATGTATCCAAGACCAACTTGGCTGTTCA 
TGAAGATAAAAACAGAGTCCCGTATGTAAAGGGGTGCACTGAGCGGTTTGTGTCGAGCCCTGAGGAAGTCATGGATG 
TAATAGATGAAGGCAAAGCAAACCGACACGTGGCTGTGACAAACATGAATGAACACAGCTCTAGAAGTCACAGTATC 
TTCCTGATAAATATTAAACAAGAGAATGTAGAGACTGAAAAAAAACTCAGTGGGAAACTTTATTTGGTTGATTTGGC 
TGGGAGCGAAAAGGTCAGCAAAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAAATATCAATAAGTCTTTGT 
CTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAAAAACACATGTGCCATACCGGGACAGCAAGATGACT
CGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACCATCGTCATTTGCTGTTCTCCTTCTGTCTTCAATGA 
GGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGGGTATGAAATCAAGGCTTAGGTGCAAAGCCATTGGATACCA 
TACCTGAGACCACACAGCCAAGCTAAGACCATCAAGAATACAGTCTCTGTGAACCTAGAACTGACAGCAGAAGAATG 
GAAGAAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTTGAAGAATGTTATCCAGCATCTGGAGATGGAGCTAA 
ACAGGTGGAGGAATGGAGAAGCTGTGCCTGAGGATGAACAGATCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGT 
GATAACACCCCCATCATAGACAATATTGCTCCTGTTGTTGCTGGCATCTCTACAGAGGAGAAAGAGAAGTACGATGA 
GGAGATCTCCAGTCTCTACAGACAACTGGATGACAAGGATGATGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGC 
TGAAGCAACAGATGTTGGATCAGGATGAGCTTTTAGCTTCCACAAGAAGAGACTATGAGAAGATACAGGAGGAGCTG 
ACACGTCTCCAGATTGAAAATGAGGCAGCCAAGGATGAGGTGAAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGT 
CAATTATGACCAGAAATCACAGGAAGTGGAGGATAAGACCCGGGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGA 
AAACGACTACATTGACAACCACACAGAGAGAGCTGAGCCAGCTACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCA 
ACTGAGATCCTGAATTTGCTGTTGAAAGATCTGGGGGAGATAGGTGGAATTATTGGCACCAATGATGTGAAAACTTT 
GGCAGATGTGAATGGAGTCATTGAGGAGGAGTTTACCATGGCCCGCCTGTACATCAGCAAGATGAAGTCAGAGGTCA 
AGTCCCTGGTGAACCGCAGCAAACAGCTCGAGAGCGCCCAGATGGACTCCAACAGGAAGATGAATGCCAGCGAGCGG 
GAGCTGGCAGCCTGCCAGCTGCTCATCTCCCAGCACGAAGCCAAGATCAAGTCTCTGACAGACTACATGCAGAACAT 
GGAACAGAAGAGGAGGCAGCTAGAAGAGTCCCAGGACTCGCTCAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAA 
AAATGCACGAAGTCAGCTTCCAGGATAAGGAGAAGGAACATCTGACGCGGTTGCAGGATGCTGAAGAAATGAAGAAG
GCGCTGGAGCAGCAGATGGAGAGCCACCGGGAAGCTCACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGA 
GAAGCAGAAAATCATTGATGAGATTCGGGATTTGAATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATT 
ATAACAAGCTGAAAATAGAGGACCAAGAGAGAGAAATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAA 
CAAGCCAGAGAAGACCTCAAAGGGCTGGAGGAGACAGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACT 
CTTTGTCCAGGATCTGACCACCCGAGTTAAAAAAAGTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCC 
AGAAGCAGAAAATTTCCTTCTTGGAGAATAACCTGGAGCAGCTCACCAAAGTTCACAAGCAGCTGGTCCGGGACAAC 
GCAGACCTGCGCTGTGAACTGCCCAAGCTGGAGAAGCGGCTGCGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAG 
CGCGCTGAAGGAGGCCAAGGAGAACGCCATGCGGGACCGTAAGCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGG 
CCGTGCGGGCCAAGAACATGGCCAGAAGGGCCCATTCAGCCCAGATCGCCAAGCCCATCCGCCCCGGACACTACCCG 
GCCTCATCTCCAACGGCCGTCCATGCCATTCGAGGGGGAGGAGGCAGCTCTTCAAATTCCACTCACTACCAGAAATA 
AATACAAAATATGACTCCACGTAGCATGTCAAGGACTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGT 
TTCCATTTTTTTTTTATACTTGCTTACTCCAGCCATCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGT 
TTCTGTGTGTACAGATGTGTGCTCGGACTTTTCTCTTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTT 
ACTACTGAGAACCATTACCACCGACTCGGCCTCCGGGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTG 
GGCAGTGCACTGTCCCATCTGTACGCCCTAATGTGCCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGAT 
GATCATGGTCTGCCTCAACTGTCTGGTTTCCTGTAAAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGC 
TGCTATAGGGTTCAAAGTTTTCCTTCTGAACACTTTTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTG 
GTCACATCTTTGGATCTGTAAAATATACCTTTTAGTATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAG 
AAAAACAATCTGACAGTAGCAGTGTAGAATTTGTTCATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTC 
ACCTCCGAGCTGCTTGCTTTTGAACCTGCAGCAACTAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGT 
GCTGAAAATGCTGCAAAGTTGGATAAGTGGAAATGTGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGA 
AGCTTGCATTGGGAATGGCTGCTTTCTCTAACCATTTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATT 
CCAGCAGTTGAAAAAGGAAGCCTTCGGGAGAAAGTGCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGT 
TGCCATGAATAAGTTATTATTTTAACCCATAATTGGCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTT 
TCATCTTCTTAGGCACAATTAGAAAAAATCCACATAGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACA 
TGAATCTAAAGCTGAATCAACCCTTACTTCCAGTTGTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCA 
GGGAAACTGACTGTGCTGCTATTTGTTTTGACAAATTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAA 
ACTCCTATCCTATCATGTTGTGTGCCCAAGATGAGTGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAA 
ATATATGCCAGGGGAGACTTTGGGCTTTTCTCATGACTGTGTGGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTG 
TGTGTGTGTGTGTGTGTGTGTGTGTATGTGTGTAAAGTGCTAAGAACTGTGCATTGACATCCAAACATTTCTTGTAC 
AAAATTTCCCTAGCAAAGCAAACCTGCTTTGACTTAATTTATTTGTTAAATGTTGCACTTTGTTTATGTATGTTTTG 
TTTTTGGTGGGGAATAAGGAGAGAGAGGACGACAAATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATTTTGCA 
TTTGTTTAAATTTTTTTCATTCTTTAATTTTGTTATCAGTGCCAGCCCAATATACCTGCTCTACCATTATTTGCGGT 
CTGATAAAAGGGTCCTTGTGGGGCAGGTTTTGCAAAGCTTATCAGGTAATAACATATGCCACATAACCTTGTTGATA 
TGTTTGCTTCTGATTTGGGAAGCTAAACATTGGTGTTTGAGAGGATTGCCAATTATTAATTGTCATTACCACTACTC 
TCCATTACTTTTTGTTTGGAAATTGAACAAAGGTCAGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAAAATAG 
ATTGTTTTAGATTCTTTCCAGGGTGATTTTTCCCTGGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTC 
ATGTTTCAAAGGGAAACATTCGCTTGTAGTTCCATTTTACTTGATCTCTACAAGGGACTGACAACATTTGCTTTACT
TTTATTCACAGAGAAAGTTGGCTTTGATGTCTCTTAAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACC 
TAGCTTCAATCTTTATAGGACTTCTAATCTAATTTTCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGAACGGA 
TTATTCAAATGGATCCTTAAATATTGCTATGTATAATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTAGGCCA 
CTTTCTAAAAAAGCCACATATGTGCAATTTTCAGGTTTTTAGACTATTGCTCCCTGTACTTTAAATGTAAAAACCAC 
ACTTCTGAACAACTAAGCTCATGAATATGATTTTGGTTATATGCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCT 
CCTCTATGAATAATTTTATATTTCATGCTACTTCTTGAAAGTTTACTCTTTGATGCTCTAAGAGAACAGCCAGATGG 
TTTATATGAATAATCTTTATCTGCAGGATGGTGGATTGGTAAATTAGGAGAATGTTGTTTGAGATATCAAGATTTAT 
GTCTGGGAACTAAAATATATAATGCCAAATGTGTTTTTGTCAATTACTAGAGAATTCTGTGCAAACATATCATCTCT 
TCAAATGCTGCACACTTTGCTTTTGTTAAACAGCAGGTAGTAGACAGAACAATAACAGTTTCGCGTTAAGACTTTTA 
AAGGAAATAGAATCGTGATTAAGAAATCAGAATTTATAGATATATTGGGATAAATGAAGAAATAAAAATGTTTGTCT 
AGAATGTAGCATCTAGTGACTTTTTAAAGCCCTAACGTTTACATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAG 
CAAATAAAAGTTCTTAACAATCCCCTCTTTCGAAGTGCATTTTTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGC 
TATATGACATACTGGGAAAGGCAGAGGGTGGAGGGAAGATTTCACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCT 
AAAGAAACCATCATAATCTAAAATTGCTTCATTTAACACTAACAATTTAGACTTTTTAAACCAAGCATTGAATAATG 
GCTGGATAACTGCCGAAGTAAGCGCCGCTCCATGAAGTCTGCTTACTTATTTAAAAATTGTGTATCAGTTTTAAATA 
CTGTTCATTGTGTGCAGATATAAGGGGAATAGGGCATTCTGTAGAATTATACATGTCTAGTTTGTAAAGTGTGTCCT 
GTGTACTGCAGATGTGTGTTCTCTGGGCTTTATGTATCTGTACAGTAGCTTTCACATTAAAAAAATTGTGGACAAAC 
TTGTCCGGGGGGTTTGAGGGGAGAATGGTGGTTTATATCAATAACGATGCTGTACTATAGTCCATGTAACAAAAGAT 
CTGGAAGTCACCCTCCTCTGGCCCACGGAAAATTTTGGTAATCTTCTAGGTTCTAAAATGAAGATGTATGGGTACTC 
TGGCAGACTGCATGTTGTATAATTTGAAAAATACTAAAAGTGGAAAATAAAATTGAATTAAACTTTGGCTGGTC 

<210> SEQ ID NO 67 

<211> Length : 5,977

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 67 
>M62096_PEA_1_T6

GCTGATTGTCCCCATGAAGGCCAGCCTTGAAGCTTGGTCAGTCTCCCTAACTGTATGATTGATCCCCACTTATTGCA 
CTACATCACTGAGTTCCCGTATGCCAAGTTATGGCCACTTACATCCACGTCAGCAAAACTGGTGCCGAGGGAGCTGT 
TCTTGACGAAGCTAAAAATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAA 
AAACACATGTGCCATACCGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACC 
ATCGTCATTTGCTGTTCTCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGAGCTAA 
GACCATCAAGAATACAGTCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAGAGAAAG 
AGAAAAACAAGACTTTGAAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATGGAGAAGCTGTG 
CCTGAGGATGAACAGATCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATCATAGACAATAT

TGCTCCTGTTGTTGCTGGCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCTCTACAGACAAC 

TGGATGACAAGGATGATGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGAT 

GAGCTTTTAGCTTCCACAAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACACGTCTCCAGATTGAAAATGAGGC 

AGCCAAGGATGAGGTGAAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAATTATGACCAGAAATCACAGGAAG 

TGGAGGATAAGACCCGGGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAACGACTACATTGACAACCACACAG 

AGAGAGCTGAGCCAGCTACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTGAGATCCTGAATTTGCTGTTGAA

AGATCTGGGGGAGATAGGTGGAATTATTGGCACCAATGATGTGAAAACTTTGGCAGATGTGAATGGAGTCATTGAGG 

AGGAGTTTACCATGGCCCGCCTGTACATCAGCAAGATGAAGTCAGAGGTCAAGTCCCTGGTGAACCGCAGCAAACAG 

CTCGAGAGCGCCCAGATGGACTCCAACAGGAAGATGAATGCCAGCGAGCGGGAGCTGGCAGCCTGCCAGCTGCTCAT 

CTCCCAGCACGAAGCCAAGATCAAGTCTCTGACAGACTACATGCAGAACATGGAACAGAAGAGGAGGCAGCTAGAAG 

AGTCCCAGGACTCGCTCAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAAAAATGCACGAAGTCAGCTTCCAGGAT 

AAGGAGAAGGAACATCTGACGCGGTTGCAGGATGCTGAAGAAATGAAGAAGGCGCTGGAGCAGCAGATGGAGAGCCA 

CCGGGAAGCTCACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAGCAGAAAATCATTGATGAGATTC 

GGGATTTGAATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAA 

GAGAGAGAAATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCT 

GGAGGAGACAGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAG 

TTAAAAAAAGTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAAGCAGAAAATTTCCTTCTTGGAG 

AATAACCTGGAGCAGCTCACCAAAGTTCACAAGCAGCTGGTCCGGGACAACGCAGACCTGCGCTGTGAACTGCCCAA 

GCTGGAGAAGCGGCTGCGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAGCGCGCTGAAGGAGGCCAAGGAGAACG 

CCATGCGGGACCGTAAGCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGGCCGTGCGGGCCAAGAACATGGCCAGA 

AGGGCCCATTCAGCCCAGATCGCCAAGCCCATCCGCCCCGGACACTACCCGGCCTCATCTCCAACGGCCGTCCATGC 

CATTCGAGGGGGAGGAGGCAGCTCTTCAAATTCCACTCACTACCAGAAATAAATACAAAATATGACTCCACGTAGCA 

TGTCAAGGACTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCCATTTTTTTTTTATACTTGCTTA 

CTCCAGCCATCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCTGTGTGTACAGATGTGTGCTCGG 

ACTTTTCTCTTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTACTGAGAACCATTACCACCGACT 

CGGCCTCCGGGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCAGTGCACTGTCCCATCTGTACGC 

CCTAATGTGCCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATCATGGTCTGCCTCAACTGTCTGG 

TTTCCTGTAAAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCTATAGGGTTCAAAGTTTTCCTTC 

TGAACACTTTTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCACATCTTTGGATCTGTAAAATAT 

ACCTTTTAGTATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAAACAATCTGACAGTAGCAGTGTA 

GAATTTGTTCATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCTCCGAGCTGCTTGCTTTTGAACC 

TGCAGCAACTAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTGAAAATGCTGCAAAGTTGGATAA 

GTGGAAATGTGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCTTGCATTGGGAATGGCTGCTTTC 

TCTAACCATTTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAGCAGTTGAAAAAGGAAGCCTTCG 

GGAGAAAGTGCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCCATGAATAAGTTATTATTTTAAC 

CCATAATTGGCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCATCTTCTTAGGCACAATTAGAAAA 

AATCCACATAGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACATGAATCTAAAGCTGAATCAACCCTTA 
CTTCCAGTTGTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGAAACTGACTGTGCTGCTATTTGT
TTTGACAAATTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTCCTATCCTATCATGTTGTGTGCC 
CAAGATGAGTGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATATATGCCAGGGGAGACTTTGGGCT 
TTTCTCATGACTGTGTGGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTA 
TGTGTGTAAAGTGCTAAGAACTGTGCATTGACATCCAAACATTTCTTGTACAAAATTTCCCTAGCAAAGCAAACCTG 
CTTTGACTTAATTTATTTGTTAAATGTTGCACTTTGTTTATGTATGTTTTGTTTTTGGTGGGGAATAAGGAGAGAGA 
GGACGACAAATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATTTTGCATTTGTTTAAATTTTTTTCATTCTTTA
ATTTTGTTATCAGTGCCAGCCCAATATACCTGCTCTACCATTATTTGCGGTCTGATAAAAGGGTCCTTGTGGGGCAG 
GTTTTGCAAAGCTTATCAGGTAATAACATATGCCACATAACCTTGTTGATATGTTTGCTTCTGATTTGGGAAGCTAA 
ACATTGGTGTTTGAGAGGATTGCCAATTATTAATTGTCATTACCACTACTCTCCATTACTTTTTGTTTGGAAATTGA 
ACAAAGGTCAGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAAAATAGATTGTTTTAGATTCTTTCCAGGGTGA 
TTTTTCCCTGGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTCATGTTTCAAAGGGAAACATTCGCTTG 
TAGTTCCATTTTACTTGATCTCTACAAGGGACTGACAACATTTGCTTTACTTTTATTCACAGAGAAAGTTGGCTTTG 
ATGTCTCTTAAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACCTAGCTTCAATCTTTATAGGACTTCTA 
ATCTAATTTTCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGAACGGATTATTCAAATGGATCCTTAAATATTG 
CTATGTATAATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTAGGCCACTTTCTAAAAAAGCCACATATGTGCA 
ATTTTCAGGTTTTTAGACTATTGCTCCCTGTACTTTAAATGTAAAAACCACACTTCTGAACAACTAAGCTCATGAAT 
ATGATTTTGGTTATATGCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCTCCTCTATGAATAATTTTATATTTCAT 
GCTACTTCTTGAAAGTTTACTCTTTGATGCTCTAAGAGAACAGCCAGATGGTTTATATGAATAATCTTTATCTGCAG 
GATGGTGGATTGGTAAATTAGGAGAATGTTGTTTGAGATATCAAGATTTATGTCTGGGAACTAAAATATATAATGCC 
AAATGTGTTTTTGTCAATTACTAGAGAATTCTGTGCAAACATATCATCTCTTCAAATGCTGCACACTTTGCTTTTGT
TAAACAGCAGGTAGTAGACAGAACAATAACAGTTTCGCGTTAAGACTTTTAAAGGAAATAGAATCGTGATTAAGAAA 
TCAGAATTTATAGATATATTGGGATAAATGAAGAAATAAAAATGTTTGTCTAGAATGTAGCATCTAGTGACTTTTTA 
AAGCCCTAACGTTTACATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAGCAAATAAAAGTTCTTAACAATCCCCT 
CTTTCGAAGTGCATTTTTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGCTATATGACATACTGGGAAAGGCAGAG 
GGTGGAGGGAAGATTTCACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCTAAAGAAACCATCATAATCTAAAATTG 
CTTCATTTAACACTAACAATTTAGACTTTTTAAACCAAGCATTGAATAATGGCTGGATAACTGCCGAAGTAAGCGCC 
GCTCCATGAAGTCTGCTTACTTATTTAAAAATTGTGTATCAGTTTTAAATACTGTTCATTGTGTGCAGATATAAGGG 
GAATAGGGCATTCTGTAGAATTATACATGTCTAGTTTGTAAAGTGTGTCCTGTGTACTGCAGATGTGTGTTCTCTGG 
GCTTTATGTATCTGTACAGTAGCTTTCACATTAAAAAAATTGTGGACAAACTTGTCCGGGGGGTTTGAGGGGAGAAT 
GGTGGTTTATATCAATAACGATGCTGTACTATAGTCCATGTAACAAAAGATCTGGAAGTCACCCTCCTCTGGCCCAC 
GGAAAATTTTGGTAATCTTCTAGGTTCTAAAATGAAGATGTATGGGTACTCTGGCAGACTGCATGTTGTATAATTTG 
AAAAATACTAAAAGTGGAAAATAAAATTGAATTAAACTTTGGCTGGTC 

<210> SEQ ID NO 68 
<211> Length : 5,999
<212> Type : DNA
<213> ORGANISM : Homo sapiens

<400> sequence : 68
>M62096_PEA_1_T7

GGTTTGACTTAGCCTGTGGCAGAGGCAGCTGCACACATGTGGGAGGGATATGAACTTCTGAGAAAAGAGAAAATCCC 
ATGTTGTACCCGACTCTAATAGGCCAGGAACGAGCTTTGCCATTGGAATGTGGCAGTCCTGCCTCTGCAGGTCAGCA 
AAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAAATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATC 
TCTGCTTTGGCAGAAGGGACAAAAACACATGTGCCATACCGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTT 
GGGTGGGAACTGCAGAACCACCATCGTCATTTGCTGTTCTCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACAC 
TGATGTTCGGACAGAGAGCTAAGACCATCAAGAATACAGTCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAG 
AAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTTGAAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAG 
GTGGAGGAATGGAGAAGCTGTGCCTGAGGATGAACAGATCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATA 
ACACCCCCATCATAGACAATATTGCTCCTGTTGTTGCTGGCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAG 
ATCTCCAGTCTCTACAGACAACTGGATGACAAGGATGATGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAA 
GCAACAGATGTTGGATCAGGATGAGCTTTTAGCTTCCACAAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACAC 
GTCTCCAGATTGAAAATGAGGCAGCCAAGGATGAGGTGAAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAAT 
TATGACCAGAAATCACAGGAAGTGGAGGATAAGACCCGGGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAAC 
GACTACATTGACAACCACACAGAGAGAGCTGAGCCAGCTACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTG 
AGATCCTGAATTTGCTGTTGAAAGATCTGGGGGAGATAGGTGGAATTATTGGCACCAATGATGTGAAAACTTTGGCA 
GATGTGAATGGAGTCATTGAGGAGGAGTTTACCATGGCCCGCCTGTACATCAGCAAGATGAAGTCAGAGGTCAAGTC 
CCTGGTGAACCGCAGCAAACAGCTCGAGAGCGCCCAGATGGACTCCAACAGGAAGATGAATGCCAGCGAGCGGGAGC 
TGGCAGCCTGCCAGCTGCTCATCTCCCAGCACGAAGCCAAGATCAAGTCTCTGACAGACTACATGCAGAACATGGAA 
CAGAAGAGGAGGCAGCTAGAAGAGTCCCAGGACTCGCTCAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAAAAAT 
GCACGAAGTCAGCTTCCAGGATAAGGAGAAGGAACATCTGACGCGGTTGCAGGATGCTGAAGAAATGAAGAAGGCGC 
TGGAGCAGCAGATGGAGAGCCACCGGGAAGCTCACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAG 
CAGAAAATCATTGATGAGATTCGGGATTTGAATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAA 
CAAGCTGAAAATAGAGGACCAAGAGAGAGAAATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAG 
CCAGAGAAGACCTCAAAGGGCTGGAGGAGACAGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTT 
GTCCAGGATCTGACCACCCGAGTTAAAAAAAGTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAA 
GCAGAAAATTTCCTTCTTGGAGAATAACCTGGAGCAGCTCACCAAAGTTCACAAGCAGCTGGTCCGGGACAACGCAG 
ACCTGCGCTGTGAACTGCCCAAGCTGGAGAAGCGGCTGCGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAGCGCG 
CTGAAGGAGGCCAAGGAGAACGCCATGCGGGACCGTAAGCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGGCCGT 
GCGGGCCAAGAACATGGCCAGAAGGGCCCATTCAGCCCAGATCGCCAAGCCCATCCGCCCCGGACACTACCCGGCCT 
CATCTCCAACGGCCGTCCATGCCATTCGAGGGGGAGGAGGCAGCTCTTCAAATTCCACTCACTACCAGAAATAAATA 
CAAAATATGACTCCACGTAGCATGTCAAGGACTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCC 
ATTTTTTTTTTATACTTGCTTACTCCAGCCATCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCT 
GTGTGTACAGATGTGTGCTCGGACTTTTCTCTTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTA 
CTGAGAACCATTACCACCGACTCGGCCTCCGGGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCA
GTGCACTGTCCCATCTGTACGCCCTAATGTGCCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATC 
ATGGTCTGCCTCAACTGTCTGGTTTCCTGTAAAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCT 
ATAGGGTTCAAAGTTTTCCTTCTGAACACTTTTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCA 
CATCTTTGGATCTGTAAAATATACCTTTTAGTATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAA
ACAATCTGACAGTAGCAGTGTAGAATTTGTTCATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCT 
CCGAGCTGCTTGCTTTTGAACCTGCAGCAACTAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTG 
AAAATGCTGCAAAGTTGGATAAGTGGAAATGTGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCT 
TGCATTGGGAATGGCTGCTTTCTCTAACCATTTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAG 
CAGTTGAAAAAGGAAGCCTTCGGGAGAAAGTGCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCC 
ATGAATAAGTTATTATTTTAACCCATAATTGGCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCAT 
CTTCTTAGGCACAATTAGAAAAAATCCACATAGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACATGAA 
TCTAAAGCTGAATCAACCCTTACTTCCAGTTGTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGA 
AACTGACTGTGCTGCTATTTGTTTTGACAAATTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTC 
CTATCCTATCATGTTGTGTGCCCAAGATGAGTGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATAT 
ATGCCAGGGGAGACTTTGGGCTTTTCTCATGACTGTGTGGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTGTGTG 
TGTGTGTGTGTGTGTGTGTGTATGTGTGTAAAGTGCTAAGAACTGTGCATTGACATCCAAACATTTCTTGTACAAAA 
TTTCCCTAGCAAAGCAAACCTGCTTTGACTTAATTTATTTGTTAAATGTTGCACTTTGTTTATGTATGTTTTGTTTT 
TGGTGGGGAATAAGGAGAGAGAGGACGACAAATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATTTTGCATTTG 
TTTAAATTTTTTTCATTCTTTAATTTTGTTATCAGTGCCAGCCCAATATACCTGCTCTACCATTATTTGCGGTCTGA 
TAAAAGGGTCCTTGTGGGGCAGGTTTTGCAAAGCTTATCAGGTAATAACATATGCCACATAACCTTGTTGATATGTT 
TGCTTCTGATTTGGGAAGCTAAACATTGGTGTTTGAGAGGATTGCCAATTATTAATTGTCATTACCACTACTCTCCA 
TTACTTTTTGTTTGGAAATTGAACAAAGGTCAGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAAAATAGATTG 
TTTTAGATTCTTTCCAGGGTGATTTTTCCCTGGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTCATGT 
TTCAAAGGGAAACATTCGCTTGTAGTTCCATTTTACTTGATCTCTACAAGGGACTGACAACATTTGCTTTACTTTTA 
TTCACAGAGAAAGTTGGCTTTGATGTCTCTTAAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACCTAGC 
TTCAATCTTTATAGGACTTCTAATCTAATTTTCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGAACGGATTAT 
TCAAATGGATCCTTAAATATTGCTATGTATAATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTAGGCCACTTT 
CTAAAAAAGCCACATATGTGCAATTTTCAGGTTTTTAGACTATTGCTCCCTGTACTTTAAATGTAAAAACCACACTT 
CTGAACAACTAAGCTCATGAATATGATTTTGGTTATATGCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCTCCTC 
TATGAATAATTTTATATTTCATGCTACTTCTTGAAAGTTTACTCTTTGATGCTCTAAGAGAACAGCCAGATGGTTTA
TATGAATAATCTTTATCTGCAGGATGGTGGATTGGTAAATTAGGAGAATGTTGTTTGAGATATCAAGATTTATGTCT 
GGGAACTAAAATATATAATGCCAAATGTGTTTTTGTCAATTACTAGAGAATTCTGTGCAAACATATCATCTCTTCAA 
ATGCTGCACACTTTGCTTTTGTTAAACAGCAGGTAGTAGACAGAACAATAACAGTTTCGCGTTAAGACTTTTAAAGG 
AAATAGAATCGTGATTAAGAAATCAGAATTTATAGATATATTGGGATAAATGAAGAAATAAAAATGTTTGTCTAGAA 
TGTAGCATCTAGTGACTTTTTAAAGCCCTAACGTTTACATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAGCAAA 
TAAAAGTTCTTAACAATCCCCTCTTTCGAAGTGCATTTTTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGCTATA 
TGACATACTGGGAAAGGCAGAGGGTGGAGGGAAGATTTCACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCTAAAG 
AAACCATCATAATCTAAAATTGCTTCATTTAACACTAACAATTTAGACTTTTTAAACCAAGCATTGAATAATGGCTG
GATAACTGCCGAAGTAAGCGCCGCTCCATGAAGTCTGCTTACTTATTTAAAAATTGTGTATCAGTTTTAAATACTGT 
TCATTGTGTGCAGATATAAGGGGAATAGGGCATTCTGTAGAATTATACATGTCTAGTTTGTAAAGTGTGTCCTGTGT 
ACTGCAGATGTGTGTTCTCTGGGCTTTATGTATCTGTACAGTAGCTTTCACATTAAAAAAATTGTGGACAAACTTGT 
CCGGGGGGTTTGAGGGGAGAATGGTGGTTTATATCAATAACGATGCTGTACTATAGTCCATGTAACAAAAGATCTGG 
AAGTCACCCTCCTCTGGCCCACGGAAAATTTTGGTAATCTTCTAGGTTCTAAAATGAAGATGTATGGGTACTCTGGC 
AGACTGCATGTTGTATAATTTGAAAAATACTAAAAGTGGAAAATAAAATTGAATTAAACTTTGGCTGGTC 

<210> SEQ ID NO 69 

<211> Length : 6,038 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 69 
>M62096_PEA_1_T9

GCTGATTGTCCCCATGAAGGCCAGCCTTGAAGCTTGGTCAGTCTCCCTAACTGTATGATTGATCCCCACTTATTGCA 
CTACATCACTGAGTTCCCGTATGCCAAGTTATGGCCACTTACATCCACGTCAGCAAAACTGGTGCCGAGGGAGCTGT 
TCTTGACGAAGCTAAAAATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAA 
AAACACATGTGCCATACCGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACC 
ATCGTCATTTGCTGTTCTCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGGGTATG 
AAATCAAGGCTTAGGTGCAAAGCCATTGGATACCATACCTGAGACCACACAGCCAAGCTAAGACCATCAAGAATACA 
GTCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTT 
GAAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATGGAGAAGCTGTGCCTGAGGATGAACAGA 
TCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATCATAGACAATATTGCTCCTGTTGTTGCT 
GGCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCTCTACAGACAACTGGATGACAAGGATGA 
TGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGATGAGCTTTTAGCTTCCA 
CAAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACACGTCTCCAGATTGAAAATGAGGCAGCCAAGGATGAGGTG 
AAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAATTATGACCAGAAATCACAGGAAGTGGAGGATAAGACCCG 
GGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAACGACTACATTGACAACCACACAGAGAGAGCTGAGCCAGC 
TACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTGAGATCCTGAATTTGCTGTTGAAAGATCTGGGGGAGATA 
GGTGGAATTATTGGCACCAATGATGTGAAAACTTTGGCAGATGTGAATGGAGTCATTGAGGAGGAGTTTACCATGGC 
CCGCCTGTACATCAGCAAGATGAAGTCAGAGGTCAAGTCCCTGGTGAACCGCAGCAAACAGCTCGAGAGCGCCCAGA 
TGGACTCCAACAGGAAGATGAATGCCAGCGAGCGGGAGCTGGCAGCCTGCCAGCTGCTCATCTCCCAGCACGAAGCC 
AAGATCAAGTCTCTGACAGACTACATGCAGAACATGGAACAGAAGAGGAGGCAGCTAGAAGAGTCCCAGGACTCGCT 
CAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAAAAATGCACGAAGTCAGCTTCCAGGATAAGGAGAAGGAACATC 
TGACGCGGTTGCAGGATGCTGAAGAAATGAAGAAGGCGCTGGAGCAGCAGATGGAGAGCCACCGGGAAGCTCACCAG 
AAGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAGCAGAAAATCATTGATGAGATTCGGGATTTGAATCAGAA
ACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAAGAGAGAGAAATGAAGC 
TGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCTGGAGGAGACAGTGTCT 
AGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAGTTAAAAAAAGTGTGGA 
GTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAAGCAGAAAATTTCCTTCTTGGAGAATAACCTGGAGCAGC 
TCACCAAAGTTCACAAGCAGCTGGTCCGGGACAACGCAGACCTGCGCTGTGAACTGCCCAAGCTGGAGAAGCGGCTG 
CGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAGCGCGCTGAAGGAGGCCAAGGAGAACGCCATGCGGGACCGTAA 
GCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGGCCGTGCGGGCCAAGAACATGGCCAGAAGGGCCCATTCAGCCC 
AGATCGCCAAGCCCATCCGCCCCGGACACTACCCGGCCTCATCTCCAACGGCCGTCCATGCCATTCGAGGGGGAGGA 
GGCAGCTCTTCAAATTCCACTCACTACCAGAAATAAATACAAAATATGACTCCACGTAGCATGTCAAGGACTACATT 
AATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCCATTTTTTTTTTATACTTGCTTACTCCAGCCATCTGCAG 
TACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCTGTGTGTACAGATGTGTGCTCGGACTTTTCTCTTTTTGA 
GAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTACTGAGAACCATTACCACCGACTCGGCCTCCGGGGTGTT 
GGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCAGTGCACTGTCCCATCTGTACGCCCTAATGTGCCATTCC 
CTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATCATGGTCTGCCTCAACTGTCTGGTTTCCTGTAAAATAAA 
CACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCTATAGGGTTCAAAGTTTTCCTTCTGAACACTTTTCCGAA 
ACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCACATCTTTGGATCTGTAAAATATACCTTTTAGTATGGCA 
CCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAAACAATCTGACAGTAGCAGTGTAGAATTTGTTCATTCAA 
ATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCTCCGAGCTGCTTGCTTTTGAACCTGCAGCAACTAGTCTT 
AGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTGAAAATGCTGCAAAGTTGGATAAGTGGAAATGTGGCTGC 
CCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCTTGCATTGGGAATGGCTGCTTTCTCTAACCATTTTCAGC 
TTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAGCAGTTGAAAAAGGAAGCCTTCGGGAGAAAGTGCTTGTC 
AAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCCATGAATAAGTTATTATTTTAACCCATAATTGGCGACTG 
TTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCATCTTCTTAGGCACAATTAGAAAAAATCCACATAGATGGA 
TATTTTACATTTAGTTATTGCTTTATCCAAATACATGAATCTAAAGCTGAATCAACCCTTACTTCCAGTTGTGCTTA 
TTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGAAACTGACTGTGCTGCTATTTGTTTTGACAAATTTGGGG 
GTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTCCTATCCTATCATGTTGTGTGCCCAAGATGAGTGAGCTG 
GCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATATATGCCAGGGGAGACTTTGGGCTTTTCTCATGACTGTGT 
GGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTGTGTAAAGTGCTA 
AGAACTGTGCATTGACATCCAAACATTTCTTGTACAAAATTTCCCTAGCAAAGCAAACCTGCTTTGACTTAATTTAT 
TTGTTAAATGTTGCACTTTGTTTATGTATGTTTTGTTTTTGGTGGGGAATAAGGAGAGAGAGGACGACAAATTCTAT 
TGAAGTATTTATTTTGTGAAGATGGCAATTTTGCATTTGTTTAAATTTTTTTCATTCTTTAATTTTGTTATCAGTGC 
CAGCCCAATATACCTGCTCTACCATTATTTGCGGTCTGATAAAAGGGTCCTTGTGGGGCAGGTTTTGCAAAGCTTAT 
CAGGTAATAACATATGCCACATAACCTTGTTGATATGTTTGCTTCTGATTTGGGAAGCTAAACATTGGTGTTTGAGA 
GGATTGCCAATTATTAATTGTCATTACCACTACTCTCCATTACTTTTTGTTTGGAAATTGAACAAAGGTCAGTAATG 
GTTTTTGGCTCTTGTTAATATCCATCATAAAATAGATTGTTTTAGATTCTTTCCAGGGTGATTTTTCCCTGGGTACC 
CCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTCATGTTTCAAAGGGAAACATTCGCTTGTAGTTCCATTTTACTT 
GATCTCTACAAGGGACTGACAACATTTGCTTTACTTTTATTCACAGAGAAAGTTGGCTTTGATGTCTCTTAAAGATA 
ATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACCTAGCTTCAATCTTTATAGGACTTCTAATCTAATTTTCCTATA

GTGTGACTAAAAGGGAGGCAAATTATTGGAACGGATTATTCAAATGGATCCTTAAATATTGCTATGTATAATAAGCC 

AGTTATTATATCAGGACCATGTTCTCTGTAGGCCACTTTCTAAAAAAGCCACATATGTGCAATTTTCAGGTTTTTAG 

ACTATTGCTCCCTGTACTTTAAATGTAAAAACCACACTTCTGAACAACTAAGCTCATGAATATGATTTTGGTTATAT 

GCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCTCCTCTATGAATAATTTTATATTTCATGCTACTTCTTGAAAGT

TTACTCTTTGATGCTCTAAGAGAACAGCCAGATGGTTTATATGAATAATCTTTATCTGCAGGATGGTGGATTGGTAA 

ATTAGGAGAATGTTGTTTGAGATATCAAGATTTATGTCTGGGAACTAAAATATATAATGCCAAATGTGTTTTTGTCA 

ATTACTAGAGAATTCTGTGCAAACATATCATCTCTTCAAATGCTGCACACTTTGCTTTTGTTAAACAGCAGGTAGTA 

GACAGAACAATAACAGTTTCGCGTTAAGACTTTTAAAGGAAATAGAATCGTGATTAAGAAATCAGAATTTATAGATA

TATTGGGATAAATGAAGAAATAAAAATGTTTGTCTAGAATGTAGCATCTAGTGACTTTTTAAAGCCCTAACGTTTAC 

ATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAGCAAATAAAAGTTCTTAACAATCCCCTCTTTCGAAGTGCATTT 

TTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGCTATATGACATACTGGGAAAGGCAGAGGGTGGAGGGAAGATTT 

CACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCTAAAGAAACCATCATAATCTAAAATTGCTTCATTTAACACTAA 

CAATTTAGACTTTTTAAACCAAGCATTGAATAATGGCTGGATAACTGCCGAAGTAAGCGCCGCTCCATGAAGTCTGC 

TTACTTATTTAAAAATTGTGTATCAGTTTTAAATACTGTTCATTGTGTGCAGATATAAGGGGAATAGGGCATTCTGT 

AGAATTATACATGTCTAGTTTGTAAAGTGTGTCCTGTGTACTGCAGATGTGTGTTCTCTGGGCTTTATGTATCTGTA 

CAGTAGCTTTCACATTAAAAAAATTGTGGACAAACTTGTCCGGGGGGTTTGAGGGGAGAATGGTGGTTTATATCAAT 

AACGATGCTGTACTATAGTCCATGTAACAAAAGATCTGGAAGTCACCCTCCTCTGGCCCACGGAAAATTTTGGTAAT 

CTTCTAGGTTCTAAAATGAAGATGTATGGGTACTCTGGCAGACTGCATGTTGTATAATTTGAAAAATACTAAAAGTG 

GAAAATAAAATTGAATTAAACTTTGGCTGGTC 

<210> SEQ ID NO 70 

<211> Length : 5,044 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 70 
>M62096_PEA_1_T11

AAGCTTCCCTGGCTTGTGCCTTTCAAATGGACTCTGGGTTTTCCTTCTGGTAGTGCATAGCTGTTCCTTTACAGGCG 
CTTAGGCGTGGCTCTAGGAAAGGTTTTATGAGTCCTGGGCTGATGTAAATGTTGACCAAACACCCTCAACCAGATGG 
CGAGTTTCTGTTTGCAGCAGAGCCCAGGCTGTCTTTTCTTCATAATTCTCTCTGTGCCCACTCCTCGAGGGCAGGAA 
CTGTCCCTGTATCAGTGAGGCATTCGGACTTGGGAGATGTTTTTAGAACATCAGACCAGAAATGAGGGAAGGTGGAA 
ATGGCCAAATCAGGTTCCCCAAGTGACTGCATGCCATCCGAGGGGCCGAGGAAGCAGAGTTCTTCTGACATGGGCTC 
TCTGTTTTAAAATATCAGCCCTTCTCCCATCTCATTATTTTTCCCCTGAAGCTCTTGCACAAGCAAACTATAAATAC 
ATCCTCAAAGCCTTATGTTTCATGACTCTTAGATGCACCCCAGAAATTAGTTTTCACTTGGGCACGGAGGAAGCTGT 
GAGGCTGTTCTGTGATCCCTCCAAATCCTGCAGAATTACTGCCTTTATTGTACAGAGCTAATAGGGTTGGAACAGAA 
CCACGGTTTTAGCCTGATGACTCAGAATTTCAGACTGATGTGGAATATATTGCTTTTTCCTCTCAATTTCAGTTTGA
ATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAAGAGAGAGAA 
ATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCTGGAGGAGAC 
AGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAGTTAAAAAAA 
GTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAAGCAGAAAATTTCCTTCTTGGAGAATAACCTG 
GAGCAGCTCACCAAAGTTCACAAGCAGCTGGTCCGGGACAACGCAGACCTGCGCTGTGAACTGCCCAAGCTGGAGAA 
GCGGCTGCGTGCCACGGCGGAGCGCGTCAAGGCTCTGGAGAGCGCGCTGAAGGAGGCCAAGGAGAACGCCATGCGGG 
ACCGTAAGCGCTACCAGCAGGAGGTGGATCGTATCAAGGAGGCCGTGCGGGCCAAGAACATGGCCAGAAGGGCCCAT 
TCAGCCCAGATCGCCAAGCCCATCCGCCCCGGACACTACCCGGCCTCATCTCCAACGGCCGTCCATGCCATTCGAGG 
GGGAGGAGGCAGCTCTTCAAATTCCACTCACTACCAGAAATAAATACAAAATATGACTCCACGTAGCATGTCAAGGA 
CTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCCATTTTTTTTTTATACTTGCTTACTCCAGCCA 
TCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCTGTGTGTACAGATGTGTGCTCGGACTTTTCTC 
TTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTACTGAGAACCATTACCACCGACTCGGCCTCCG 
GGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCAGTGCACTGTCCCATCTGTACGCCCTAATGTG 
CCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATCATGGTCTGCCTCAACTGTCTGGTTTCCTGTA 
AAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCTATAGGGTTCAAAGTTTTCCTTCTGAACACTT 
TTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCACATCTTTGGATCTGTAAAATATACCTTTTAG 
TATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAAACAATCTGACAGTAGCAGTGTAGAATTTGTT 
CATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCTCCGAGCTGCTTGCTTTTGAACCTGCAGCAAC 
TAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTGAAAATGCTGCAAAGTTGGATAAGTGGAAATG 
TGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCTTGCATTGGGAATGGCTGCTTTCTCTAACCAT 
TTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAGCAGTTGAAAAAGGAAGCCTTCGGGAGAAAGT 
GCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCCATGAATAAGTTATTATTTTAACCCATAATTG 
GCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCATCTTCTTAGGCACAATTAGAAAAAATCCACAT 
AGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACATGAATCTAAAGCTGAATCAACCCTTACTTCCAGTT 
GTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGAAACTGACTGTGCTGCTATTTGTTTTGACAAA 
TTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTCCTATCCTATCATGTTGTGTGCCCAAGATGAG 
TGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATATATGCCAGGGGAGACTTTGGGCTTTTCTCATG 
ACTGTGTGGGTCGAAGGTAGCTCAAGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTGTGTAA 
AGTGCTAAGAACTGTGCATTGACATCCAAACATTTCTTGTACAAAATTTCCCTAGCAAAGCAAACCTGCTTTGACTT 
AATTTATTTGTTAAATGTTGCACTTTGTTTATGTATGTTTTGTTTTTGGTGGGGAATAAGGAGAGAGAGGACGACAA 
ATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATTTTGCATTTGTTTAAATTTTTTTCATTCTTTAATTTTGTTA 
TCAGTGCCAGCCCAATATACCTGCTCTACCATTATTTGCGGTCTGATAAAAGGGTCCTTGTGGGGCAGGTTTTGCAA 
AGCTTATCAGGTAATAACATATGCCACATAACCTTGTTGATATGTTTGCTTCTGATTTGGGAAGCTAAACATTGGTG 
TTTGAGAGGATTGCCAATTATTAATTGTCATTACCACTACTCTCCATTACTTTTTGTTTGGAAATTGAACAAAGGTC 
AGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAAAATAGATTGTTTTAGATTCTTTCCAGGGTGATTTTTCCCT 
GGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCACTTTCATGTTTCAAAGGGAAACATTCGCTTGTAGTTCCAT 
TTTACTTGATCTCTACAAGGGACTGACAACATTTGCTTTACTTTTATTCACAGAGAAAGTTGGCTTTGATGTCTCTT 
AAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGTTCACCTAGCTTCAATCTTTATAGGACTTCTAATCTAATTT
TCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGAACGGATTATTCAAATGGATCCTTAAATATTGCTATGTATA 
ATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTAGGCCACTTTCTAAAAAAGCCACATATGTGCAATTTTCAGG 
TTTTTAGACTATTGCTCCCTGTACTTTAAATGTAAAAACCACACTTCTGAACAACTAAGCTCATGAATATGATTTTG 
GTTATATGCAGCTTTTGACTAGCATGTATTGTGTCTTTTTCTCCTCTATGAATAATTTTATATTTCATGCTACTTCT
TGAAAGTTTACTCTTTGATGCTCTAAGAGAACAGCCAGATGGTTTATATGAATAATCTTTATCTGCAGGATGGTGGA 
TTGGTAAATTAGGAGAATGTTGTTTGAGATATCAAGATTTATGTCTGGGAACTAAAATATATAATGCCAAATGTGTT 
TTTGTCAATTACTAGAGAATTCTGTGCAAACATATCATCTCTTCAAATGCTGCACACTTTGCTTTTGTTAAACAGCA 
GGTAGTAGACAGAACAATAACAGTTTCGCGTTAAGACTTTTAAAGGAAATAGAATCGTGATTAAGAAATCAGAATTT 
ATAGATATATTGGGATAAATGAAGAAATAAAAATGTTTGTCTAGAATGTAGCATCTAGTGACTTTTTAAAGCCCTAA 
CGTTTACATAAAGAAGCTCTAGTTCTTATAGAAATAACAAAGCAAATAAAAGTTCTTAACAATCCCCTCTTTCGAAG 
TGCATTTTTTTAAAGCAGGGCAGGAGACATTTGGACTCTAGCTATATGACATACTGGGAAAGGCAGAGGGTGGAGGG 
AAGATTTCACTTCATTGTCTAGCCCAGAATCTTGAGCAAGCTAAAGAAACCATCATAATCTAAAATTGCTTCATTTA
ACACTAACAATTTAGACTTTTTAAACCAAGCATTGAATAATGGCTGGATAACTGCCGAAGTAAGCGCCGCTCCATGA 
AGTCTGCTTACTTATTTAAAAATTGTGTATCAGTTTTAAATACTGTTCATTGTGTGCAGATATAAGGGGAATAGGGC 
ATTCTGTAGAATTATACATGTCTAGTTTGTAAAGTGTGTCCTGTGTACTGCAGATGTGTGTTCTCTGGGCTTTATGT 
ATCTGTACAGTAGCTTTCACATTAAAAAAATTGTGGACAAACTTGTCCGGGGGGTTTGAGGGGAGAATGGTGGTTTA
TATCAATAACGATGCTGTACTATAGTCCATGTAACAAAAGATCTGGAAGTCACCCTCCTCTGGCCCACGGAAAATTT 
TGGTAATCTTCTAGGTTCTAAAATGAAGATGTATGGGTACTCTGGCAGACTGCATGTTGTATAATTTGAAAAATACT 
AAAAGTGGAAAATAAAATTGAATTAAACTTTGGCTGGTC 

<210> SEQ ID NO 71 

<211> Length : 2,945

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 71 
>M62096_PEA_1_T13

CGGCAGAGCGCGCTGGTGCTGATGCAGGATGGCTGAGCGCGCAGGAGCCCGGGAGGTCTGAGCCGGGCGAGGCTCGC 
TCCCTGCGCATCGCCTCCTCCGCCCGCCGCGTGGTCGCGGGCAGGTGGGCCGGGGGGCGCTGGGCAGGGGCGGGGCA 
GGGCCAGGGCAGGCCGGTCTGCAGCCGGAGGGGCCGGAGCGGAGAAGCTGCCCACCTTCCCGGGCTCGGAGCGGCCG 
GGGCTGCTCAGCCGGCCGGGCTCGCGATGACCTGCTGAGAAGCGTCGTCGGAGGCTGCAGGAGGCGGCCTAGCTGTG 
GGCGGTGCAGCTCGCGGCCTCCTCCCTCGTCGTTCCCGGCCCCGGCCCCCCACCCATCCCCGTGCCCCCTCCCTACC 
GCCGGCCGAGATGGCGGATCCAGCCGAATGCAGCATCAAAGTGATGTGCCGGTTCCGGCCCCTCAACGAAGCGGAGA 
TCCTCCGCGGGGACAAATTCATCCCCAAATTTAAAGGCGATGAGACCGTGGTGATCGGGCAAGGGAAGCCATATGTC 
TTCGACAGAGTGCTACCTCCCAACACGACCCAAGAGCAGGTTTACAATGCATGTGCGAAGCAAATTGTCAAAGATGT 
CCTTGAAGGTTATAACGGGACGATTTTTGCGTATGGGCAGACTTCATCAGGAAAAACCCACACCATGGAGGGGAAGC

TGCATGACCCCCAGCTCATGGGGATCATCCCACGAATTGCCCATGATATCTTTGACCATATCTACTCCATGGATGAG 

AACCTGGAGTTTCACATAAAGGTTTCCTATTTTGAGATCTACTTGGACAAAATAAGGGACTTACTTGATGTATCCAA 

GACCAACTTGGCTGTTCATGAAGATAAAAACAGAGTCCCGTATGTAAAGGGGTGCACTGAGCGGTTTGTGTCGAGCC 

CTGAGGAAGTCATGGATGTAATAGATGAAGGCAAAGCAAACCGACACGTGGCTGTGACAAACATGAATGAACACAGC 

TCTAGAAGTCACAGTATCTTCCTGATAAATATTAAACAAGAGAATGTAGAGACTGAAAAAAAACTCAGTGGGAAACT 

TTATTTGGTTGATTTGGCTGGGAGCGAAAAGGTCAGCAAAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAA 

ATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAAAAACACATGTGCCATAC 

CGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACCATCGTCATTTGCTGTTC 

TCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGAGCTAAGACCATCAAGAATACAG 

TCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTTG 

AAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATGGAGAAGCTGTGCCTGAGGATGAACAGAT 

CAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATCATAGACAATATTGCTCCTGTTGTTGCTG 

GCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCTCTACAGACAACTGGATGACAAGGATGAT 

GAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGATGAGCTTTTAGCTTCCAC 

AAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACACGTCTCCAGATTGAAAATGAGGCAGCCAAGGATGAGGTGA 

AAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAATTATGACCAGAAATCACAGGAAGTGGAGGATAAGACCCGG 

GCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAACGACTACATTGACAACCACACAGAGAGAGCTGAGCCAGCT 

ACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTGAGATCCTGAATTTGCTGTTGAAAGATCTGGGGGAGATAG 

GTGGAATTATTGGCACCAATGATGTGAAAACTTTGGCAGATGTGAATGGAGTCATTGAGGAGGAGTTTACCATGGCC 

CGCCTGTACATCAGCAAGATGAAGTCAGAGGTCAAGTCCCTGGTGAACCGCAGCAAACAGCTCGAGAGCGCCCAGAT 

GGACTCCAACAGGAAGATGAATGCCAGCGAGCGGGAGCTGGCAGCCTGCCAGCTGCTCATCTCCCAGCACGAAGCCA 

AGATCAAGTCTCTGACAGACTACATGCAGAACATGGAACAGAAGAGGAGGCAGCTAGAAGAGTCCCAGGACTCGCTC 

AGCGAAGAGCTGGCAAAGCTCCGAGCCCAGGAAAAAATGCACGAAGTCAGCTTCCAGGATAAGGAGAAGGAACATCT 

GACGCGGTTGCAGGATGCTGAAGAAATGAAGAAGGCGCTGGAGCAGCAGATGGAGAGCCACCGGGAAGCTCACCAGA 

AGCAGCTGTCCAGACTCCGAGACGAAATTGAGGAGAAGCAGAAAATCATTGATGAGATTCGGGAGTGAGTCGGCCCA 

GGGCACCAGGGGTGTGGGGGTGGTCATGCCTCGGTCCTCTTGGGGAAGCCTGGAAGGATGTGGCTCTTAGTCGAGGG 

CCCTGCTCACCTTGCCCTGTGGGCACTGCCCTGGTGACACAGCAGGCTGGGCGGGCTGCTTCCAAGGTCTGTTCTCC 

GATCTGGAGCTGAGCCTCCTGGAGCCCTGGCATGGCAGGTGGCAGGCGGGCCCAGCTGCCTCTCCTAGTCCCCGAGG 

GCCAGGGTCACATAGGTGATTCCGCTGGATGGACGCATGCCCCAGTAGATGGGGGGGAGCATCTGTAATGAAAGCAC 
CAAAAAAAAAAAAAAAAAA 

<210> SEQ ID NO 72 

<211> Length : 2,261 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 
<400> sequence : 72
>M62096_PEA_1_T14

CGGCAGAGCGCGCTGGTGCTGATGCAGGATGGCTGAGCGCGCAGGAGCCCGGGAGGTCTGAGCCGGGCGAGGCTCGC 

TCCCTGCGCATCGCCTCCTCCGCCCGCCGCGTGGTCGCGGGCAGGTGGGCCGGGGGGCGCTGGGCAGGGGCGGGGCA 

GGGCCAGGGCAGGCCGGTCTGCAGCCGGAGGGGCCGGAGCGGAGAAGCTGCCCACCTTCCCGGGCTCGGAGCGGCCG 

GGGCTGCTCAGCCGGCCGGGCTCGCGATGACCTGCTGAGAAGCGTCGTCGGAGGCTGCAGGAGGCGGCCTAGCTGTG 

GGCGGTGCAGCTCGCGGCCTCCTCCCTCGTCGTTCCCGGCCCCGGCCCCCCACCCATCCCCGTGCCCCCTCCCTACC 

GCCGGCCGAGATGGCGGATCCAGCCGAATGCAGCATCAAAGTGATGTGCCGGTTCCGGCCCCTCAACGAAGCGGAGA 

TCCTCCGCGGGGACAAATTCATCCCCAAATTTAAAGGCGATGAGACCGTGGTGATCGGGCAAGGGAAGCCATATGTC 

TTCGACAGAGTGCTACCTCCCAACACGACCCAAGAGCAGGTTTACAATGCATGTGCGAAGCAAATTGTCAAAGATGT 

CCTTGAAGGTTATAACGGGACGATTTTTGCGTATGGGCAGACTTCATCAGGAAAAACCCACACCATGGAGGGGAAGC 

TGCATGACCCCCAGCTCATGGGGATCATCCCACGAATTGCCCATGATATCTTTGACCATATCTACTCCATGGATGAG 

AACCTGGAGTTTCACATAAAGGTTTCCTATTTTGAGATCTACTTGGACAAAATAAGGGACTTACTTGATGTATCCAA 

GACCAACTTGGCTGTTCATGAAGATAAAAACAGAGTCCCGTATGTAAAGGGGTGCACTGAGCGGTTTGTGTCGAGCC 

CTGAGGAAGTCATGGATGTAATAGATGAAGGCAAAGCAAACCGACACGTGGCTGTGACAAACATGAATGAACACAGC 

TCTAGAAGTCACAGTATCTTCCTGATAAATATTAAACAAGAGAATGTAGAGACTGAAAAAAAACTCAGTGGGAAACT 

TTATTTGGTTGATTTGGCTGGGAGCGAAAAGGTCAGCAAAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAA 

ATATCAATAAGTCTTTGTCTGCTCTTGGAAATGTGATCTCTGCTTTGGCAGAAGGGACAAAAACACATGTGCCATAC

CGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCACCATCGTCATTTGCTGTTC 

TCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAGAGCTAAGACCATCAAGAATACAG 

TCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAGAGAAAGAGAAAAACAAGACTTTG 

AAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATGGAGAAGCTGTGCCTGAGGATGAACAGAT 

CAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATCATAGACAATATTGCTCCTGTTGTTGCTG 

GCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCTCTACAGACAACTGGATGACAAGGATGAT 

GAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGATGAGGTAAAGAATGCAAT 

ATATTTTTTTTTCCACAAAGTTCTTCTATTACTCTTTGTTGTTGATGTTTGTTCCAGGAATTTAATTGGCATAGAAG 

CTTTTCATAATTACAGAATCATGTGGAAATTTCTTGGTAGATGTCCCTTCACTGCCTCTTACAAGCTGATTATCACT 

GAATTTAGAAAATAAATGTCTGACTTTCAAAAACCCCTGATGTTTTGAGATTGAGTAGCCAGTGGCTACAGTTCGTT 

CTGGAAGGGCAGAGACCTTTGGTTGGGTGATCAAGCAAGGATGATCCTTTTTTATTTTTATTTTTTTGAGACAGGGT 

CTCTCTGTTGTCCAGGCTGGAATGCAGTGGTGCAATCATGGCTCACTGCAACCTCCAGAGCTCAAATGATCTTCCCG 

CCTAAGACTCTCAAGTAGCTAAGACTACAAGAATGTGCCACCATACCTAGCTAATTTTTTAATATTTTGAGACAGAG 

TTTCTCTATGTTGGTCAGGGTGATCTTG 

<210> SEQ ID NO 73 
<211> Length : 1,059 
<212> Type : DNA

<213> ORGANISM : Homo sapiens 

<400> sequence : 73 
>M62096_PEA_1_T15

AAGCTTCCCTGGCTTGTGCCTTTCAAATGGACTCTGGGTTTTCCTTCTGGTAGTGCATAGCTGTTCCTTTACAGGCG 
CTTAGGCGTGGCTCTAGGAAAGGTTTTATGAGTCCTGGGCTGATGTAAATGTTGACCAAACACCCTCAACCAGATGG 
CGAGTTTCTGTTTGCAGCAGAGCCCAGGCTGTCTTTTCTTCATAATTCTCTCTGTGCCCACTCCTCGAGGGCAGGAA 
CTGTCCCTGTATCAGTGAGGCATTCGGACTTGGGAGATGTTTTTAGAACATCAGACCAGAAATGAGGGAAGGTGGAA 
ATGGCCAAATCAGGTTCCCCAAGTGACTGCATGCCATCCGAGGGGCCGAGGAAGCAGAGTTCTTCTGACATGGGCTC 
TCTGTTTTAAAATATCAGCCCTTCTCCCATCTCATTATTTTTCCCCTGAAGCTCTTGCACAAGCAAACTATAAATAC 
ATCCTCAAAGCCTTATGTTTCATGACTCTTAGATGCACCCCAGAAATTAGTTTTCACTTGGGCACGGAGGAAGCTGT 
GAGGCTGTTCTGTGATCCCTCCAAATCCTGCAGAATTACTGCCTTTATTGTACAGAGCTAATAGGGTTGGAACAGAA 
CCACGGTTTTAGCCTGATGACTCAGAATTTCAGACTGATGTGGAATATATTGCTTTTTCCTCTCAATTTCAGTTTGA 
ATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAAGAGAGAGAA 
ATGAAGCTGGAAAAGCTCTTATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCTGGAGGAGAC 
AGTGTCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAGTTAAAAAAG 
TGAGTTCTCTTTGTCTGAATGGGACTGAGAAGAAAATCAAAGATGGCAGGGAAGAATCATTTTCAGTTGAAATATCA 
CTTGCTTAAGTCGGGGCTGGTTATGCTTAAAAATTAATTACTGCACACCAGAGAATGT 



<210> SEQ ID NO 74 

<211> Length : 2,715 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 74 
>M78076_PEA_1_T2

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 
GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 
CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 
TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 
GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 
TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCC 
AGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGG 
TGAGTGAGCCCACATATAGATGACCCCAGACATTAGGGAACAGGCCCCAGCCTAATTTGTAATCCCCTAGAGTCTGA 
GGGTGTCTTCACCACCACAGTGACTGGGAGAGGATGAGGAGGAACGTCTAAGGTTGCAGGGGCCTCTGTAGGATCCC 
CAATCCTCCTTCTTAGTCCCTGGAAGGATGTTTCTCCACCTTTCTTTGCTGATACCCTCCTCTCTTCACTGTTCCAC 
TCCCTTGCTTCCTCTGGCTGCCAGCAGACACCCCCATGACCCTTCCAAAAGGGTCCACAGAACAAGATGCTGCATCC 
CCTGAGAAAGAGAAGATGAACCCGCTGGAACAGTATGAGCGAAAGGTGAATGCGTCTGTTCCAAGGGGTTTCCCTTT 
CCACTCATCGGAGATTCAGAGGGATGAGCTGGCACCAGCTGGGACAGGGGTGTCCCGTGAGGCTGTGTCGGGTCTGC 
TGATCATGGGAGCGGGCGGAGGCTCCCTCATCGTCCTCTCCATGCTGCTCCTGCGCAGGAAGAAGCCCTACGGGGCT 
ATCAGCCATGGCGTGGTGGAGGTGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTCCGCGAACTGCAGCGGCACGG 
CTATGAGAACCCCACTTACCGCTTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCCTTCAGCCGAGCCCAGA 
CCTCCCCTCTTCCTGGAGCCCCAGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTTCACA 
CCCTTTTGTGAGACGGCTGGAAATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCGCTAAGAATTCCCAGATAG 
TCCCAGCAGCCTCCCCACGTGGCACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGTGACC 
GCCACCTTGGTCCTAGTGTCTATTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTCCTCT 
TCCCTACCAGGCCAGTCTGA 

<210> SEQ ID NO 75 

<211> Length : 2,931

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 75 




>M78076_PEA_1_T3

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 
GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 
CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 
TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 
GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 
TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCC 
AGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGA 
CACCCCCATGACCCTTCCAAAAGGTGAGTGTCTCACAGTTAACCCCAGCCTCCAAATCCCACTGAATCCCTGAACCC 
AGAAGGAAACAGGGTCCATCCATTGGGAACCTCAGACCCCCTGGGGTAGAGTTTGATGTACTTTCCAGCCCCCTCCT 
CTGGACCCTAAAGAATGAGATAGGGCCAGGCGCTGGTGACTCACACCCGTAATCCTAGCACTTTCAGAGGCTGAGGC 
AGGAGGATCCCTTGAGGCCACGAGTTCTAGACCAGCCTGGGCAACATAATGAGACCCTGTACCTACAAATAATTTAA 
AAATTACCTGGGTGTGGTGGGGCATGTCTGTAGTCCCAGCTGCTCAGGAGGCTGACGTAGAAGGATCACTGGAGCCC 
AGGAAGTTGAGGCTGCAGTGAGCTGAGATCATGCCACTGCACTCCAGCCTGGGTGACAGAGTGAGACTCTGTCTAAA 
GAAAAAAAAAAAGAATGAGATCAGACTTGGGGGTAGGGTCCACAGAACAAGATGCTGCATCCCCTGAGAAAGAGAAG 
ATGAACCCGCTGGAACAGTATGAGCGAAAGGTGAATGCGTCTGTTCCAAGGGGTTTCCCTTTCCACTCATCGGAGAT 
TCAGAGGGATGAGCTGGCACCAGCTGGGACAGGGGTGTCCCGTGAGGCTGTGTCGGGTCTGCTGATCATGGGAGCGG 
GCGGAGGCTCCCTCATCGTCCTCTCCATGCTGCTCCTGCGCAGGAAGAAGCCCTACGGGGCTATCAGCCATGGCGTG 
GTGGAGGTGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTCCGCGAACTGCAGCGGCACGGCTATGAGAACCCCAC 
TTACCGCTTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCCTTCAGCCGAGCCCAGACCTCCCCTCTTCCTG 
GAGCCCCAGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTTCACACCCTTTTGTGAGACG 
GCTGGAAATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCTAAGAATTCCCAGATAGTCCCAGCAGCCTCCC 
CACGTGGCACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGTGACCGCCACCTTGGTCCTA 
GTGTCTATTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTCCTCTTCCCTACCAGGCCAG
TCTGA 

<210> SEQ ID NO 76 

<211> Length : 3,190

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 



<400> sequence : 76 
>M78076_PEA_1_T5

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 

CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 

GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 

CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 

GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 

GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 

GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 

CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 

TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 

GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 

TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 

GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 

CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 

CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 

CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 

TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 

GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 

CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 

TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 

GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 

TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCC 

AGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGG

TGAGTGAGCCCACATATAGATGACCCCAGACATTAGGGAACAGGCCCCAGCCTAATTTGTAATCCCCTAGAGTCTGA 

GGGTGTCTTCACCACCACAGTGACTGGGAGAGGATGAGGAGGAACGTCTAAGGTTGCAGGGGCCTCTGTAGGATCCC 

CAATCCTCCTTCTTAGTCCCTGGAAGGATGTTTCTCCACCTTTCTTTGCTGATACCCTCCTCTCTTCACTGTTCCAC 

TCCCTTGCTTCCTCTGGCTGCCAGCAGACACCCCCATGACCCTTCCAAAAGGTGAGTGTCTCACAGTTAACCCCAGC 
CTCCAAATCCCACTGAATCCCTGAACCCAGAAGGAAACAGGGTCCATCCATTGGGAACCTCAGACCCCCTGGGGTAG
AGTTTGATGTACTTTCCAGCCCCCTCCTCTGGACCCTAAAGAATGAGATAGGGCCAGGCGCTGGTGACTCACACCCG 
TAATCCTAGCACTTTCAGAGGCTGAGGCAGGAGGATCCCTTGAGGCCACGAGTTCTAGACCAGCCTGGGCAACATAA 
TGAGACCCTGTACCTACAAATAATTTAAAAATTACCTGGGTGTGGTGGGGCATGTCTGTAGTCCCAGCTGCTCAGGA 
GGCTGACGTAGAAGGATCACTGGAGCCCAGGAAGTTGAGGCTGCAGTGAGCTGAGATCATGCCACTGCACTCCAGCC 
TGGGTGACAGAGTGAGACTCTGTCTAAAGAAAAAAAAAAAGAATGAGATCAGACTTGGGGGTAGGGTCCACAGAACA 
AGATGCTGCATCCCCTGAGAAAGAGAAGATGAACCCGCTGGAACAGTATGAGCGAAAGGTGAATGCGTCTGTTCCAA 
GGGGTTTCCCTTTCCACTCATCGGAGATTCAGAGGGATGAGCTGGCACCAGCTGGGACAGGGGTGTCCCGTGAGGCT 
GTGTCGGGTCTGCTGATCATGGGAGCGGGCGGAGGCTCCCTCATCGTCCTCTCCATGCTGCTCCTGCGCAGGAAGAA 
GCCCTACGGGGCTATCAGCCATGGCGTGGTGGAGGTGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTCCGCGAAC 
TGCAGCGGCACGGCTATGAGAACCCCACTTACCGCTTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCCTTC 
AGCCGAGCCCAGACCTCCCCTCTTCCTGGAGCCCCAGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAG 
TGATCATTTCACACCCTTTTGTGAGACGGCTGGAAATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCTAAG 
AATTCCCAGATAGTCCCAGCAGCCTCCCCACGTGGCACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCT 
CTTTAAGGTGACCGCCACCTTGGTCCTAGTGTCTATTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCC 
AATAAAGTCCTCTTCCCTACCAGGCCAGTCTGA 

<210> SEQ ID NO 77 

<211> Length : 2,385 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 77 
>M78076_PEA_1_T13

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 
GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 
CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 
TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 
GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 
TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCC 
AGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGA 
CACCCCCATGACCCTTCCAAAAGGTGAATGCGTCTGTTCCAAGGGGTTTCCCTTTCCACTCATCGGAGATTCAGAGG 
GATGAGCTGGCACCAGCTGGGACAGGGGTGTCCCGTGAGGCTGTGTCGGGTCTGCTGATCATGGGAGCGGGCGGAGG 
CTCCCTCATCGTCCTCTCCATGCTGCTCCTGCGCAGGAAGAAGCCCTACGGGGCTATCAGCCATGGCGTGGTGGAGG 
TGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTCCGCGAACTGCAGCGGCACGGCTATGAGAACCCCACTTACCGC 
TTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCCTTCAGCCGAGCCCAGACCTCCCCTCTTCCTGGAGCCCC 
AGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTTCACACCCTTTTGTGAGACGGCTGGAA 
ATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCTAAGAATTCCCAGATAGTCCCAGCAGCCTCCCCACGTGG 
CACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGTGACCGCCACCTTGGTCCTAGTGTCTA 
TTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTCCTCTTCCCTACCAGGCCAGTCTGA 

<210> SEQ ID NO 78 

<211> Length : 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 78 
>M78076_PEA_1_T15

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 
GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 
CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 
TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 
GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 
TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCC 
AGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGA 
CACCCCCATGACCCTTCCAAAAGGGTCCACAGAACAAGATGCTGCATCCCCTGAGAAAGAGAAGATGAACCCGCTGG 
AACAGTATGAGCGAAAGGTGAATGCGTCTGTTCCAAGGGGTTTCCCTTTCCACTCATCGGAGATTCAGAGGGATGAG 
CTGGTAAGAGGAGGAACAGCCGGGTACCTAGGGGAAGAGACCAGAGGTCAGCGGCCAGGCTGTGATTCCCAAAGCCA 
CACAGGACCCTCAAAGAAGCCCTCTGCCCCATCTCCTCTCCCTGCAGGCACCAGCTGGGACAGGGGTGTCCCGTGAG 
GCTGTGTCGGGTCTGCTGATCATGGGAGCGGGCGGAGGCTCCCTCATCGTCCTCTCCATGCTGCTCCTGCGCAGGAA 
GAAGCCCTACGGGGCTATCAGCCATGGCGTGGTGGAGGTGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTCCGCG 
AACTGCAGCGGCACGGCTATGAGAACCCCACTTACCGCTTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCC 
TTCAGCCGAGCCCAGACCTCCCCTCTTCCTGGAGCCCCAGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTG 
AAGTGATCATTTCACACCCTTTTGTGAGACGGCTGGAAATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCT 
AAGAATTCCCAGATAGTCCCAGCAGCCTCCCCACGTGGCACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATG 
GCTCTTTAAGGTGACCGCCACCTTGGTCCTAGTGTCTATTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACAT 
CCCAATAAAGTCCTCTTCCCTACCAGGCCAGTCTGA

<210> SEQ ID NO 79 

<211> Length : 2,297 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 79 
>M78076_PEA_1_T23

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGCTACCT 
GCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAAGGCAC 
AGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGCTTGAC 
CAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCAGGAACTCCTCCACTCTGAACACCTGGGTCCCAGTGA 
ATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAGGGTGGGCTGCAGCCTCCAGATTCCAAGGATGACACCC 
CCATGACCCTTCCAAAAGGGTCCACAGAACAAGATGCTGCATCCCCTGAGAAAGAGAAGATGAACCCGCTGGAACAG 
TATGAGCGAAAGGTGAATGCGTCTGTTCCAAGGGGTTTCCCTTTCCACTCATCGGAGATTCAGAGGGATGAGCTGGC 
ACCAGCTGGGACAGGGGTGTCCCGTGAGGCTGTGTCGGGTCTGCTGATCATGGGAGCGGGCGGAGGCTCCCTCATCG 
TCCTCTCCATGCTGCTCCTGCGCAGGAAGAAGCCCTACGGGGCTATCAGCCATGGCGTGGTGGAGGTGGACCCCATG 
CTGACCCTGGAGGAGCAGCAGCTCCGCGAACTGCAGCGGCACGGCTATGAGAACCCCACTTACCGCTTCCTGGAGGA 
ACGACCCTGACCCGGCCCCCTTCACCCCTTCAGCCGAGCCCAGACCTCCCCTCTTCCTGGAGCCCCAGAACCCCAAC 
TCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTTCACACCCTTTTGTGAGACGGCTGGAAATTCTTATTTC 
CCCTTTCCAATTCCAAAATTCCATCCCTAAGAATTCCCAGATAGTCCCAGCAGCCTCCCCACGTGGCACCTCCTCAC 
CTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGTGACCGCCACCTTGGTCCTAGTGTCTATTCCCTGGAAT 
TCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTCCTCTTCCCTACCAGGCCAGTCTGA 

<210> SEQ ID NO 80 

<211> Length : 2,457

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 80 
>M78076_PEA_1_T26
CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT

CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 

GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 

CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 

GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 

GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 

GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 

CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 

TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 

GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 

TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 

GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 

CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 

CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 

CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 

TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 

GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 

CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 

TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 

GGCACAGCAGATGCGCTTCCAGGTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGC 

TTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCGTGAGTGTCTATTACCCTGGCTCCCATTACAG 

ATCTCTGAGGGCAGATCTTGACTCCTAAATGTTGGGCCCCCCCAATTTCATTTATTCCTCTATAACAAACAGCCCAG 

ACCTTAGCAGTGAAAATCAACAATGATTTTTCTTTGTTCATGATTCTGCCATCCGGTCTGCGCTCAGCAGAGTGGTT 

CTTTCAGTGGTCTTGCCAGTGGTCAAGCATGCAGCTGTATTTAGCTAGCAGATCATCTAGGGGCTGGGAGTCTAGCA 

CAAATGGACCTTTCTCTCTCTCCAAGGAAGCGCAAGGCCTCTCTTCTCCGTGGAGCTTCTCCATGTGGTCTCATCAG 

CAGGGTAGCTAGATTCCCTACATGGTGGTTTATGCTCTCTAAGACATCACAGTGGAAGTTGCTAGGTCTTAAGGCTT 

GGGCCCACATTCTATTTGTTAAAGCAAGTTACAAATTCAGTCCAGATTCAAGGGAAGGAACCTATATGCATACCGGA 

AAGTGTGACCTATTGCAGCCCCCACATCTATTGTGTCTTTCTCCTGGATATCTCACACATAACCCTGATTCTCCTAG 

TATTTAAGAAAGCTATCATCTTGAGGCGCGGTGGCTCACGCCTATAATCCCAGCACTTTAGGAGGCCGAGGCGGGTG 

GATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGTGAAACCCCGTCTTTACTAAAAATACAAAAATC 

AGCCGGGCATGATGTCGCTTGCCTGTAATCCCAGCTACTTAGGAGGCTGAGGCAAGAGAATTGCTTGAACCCGGGAG 

GTGGAGGTTGCAGTGAGCTGAGATCGCATCATTGCACTCCAGCTGGGCAACAAGAGTGAGACTCTGTCTC 

<210> SEQ ID NO 81 
<211> Length : 4,104
<212> Type : DNA 
<213> ORGANISM : Homo sapiens

<400> sequence : 81 
>M78076_PEA_1_T27

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT 
TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 
GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 
TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 
GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 
CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 
CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 
CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 
TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 
GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 
CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 
TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 
GGCACAGCAGATGCGCTTCCAGGTGCTCACATCCTTCCAGCTCCCAAATGCGCCGCTATTCCTCAGACGCCCGCGCC 
TCAGGCTCTTCTCTTGTCCCTTAGACCCTCTTTCTGTCTCTTGGACCCCTTCCTATCCCCTGAACACCGCTTCTCTG 
CCCCTTCCCAGTCTCTCAGCTCAGCTTCCTGACCCTGAAACATGGACCCTCACATGCTGTGTCTTTGACCCCTGCTT 
CTTGGCCCTTGGATTCCTACTCCCCCCGCCGTCGATCCTATGTTCTGTCCCTTGGATTTTCACTGCCTTTCCCAGAA 
TCGTCTTTTTTTTTTTTTTTTTTTTGAGACAGGTTCTTGCTCTGTCGCCCAGGCAGGAGAGCAGTGTGCGATCTTGG 
CTCATTGCAACTTCCACCTCCTGGGTTCAAGCAATTCTCCTGCCTCAGCCTCTCGAGTAGCTGGGATTACAGGAGCC 
TGCCACCACACTGGGCTAATTTTTTTTTTTTTTTTTGACAGAGTCTCGCTCTGTTTCCCAGGCTGGAGTGCAGTGAC 
ATGATCTGGGCTCACTGCAACCTCCGCCTACTGGGTTCAAGCTATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGAC 
TACAGGCGGGTGTCACCACATCTGGCTGATTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATACTGGTCAGGCTG 
GTCTTGAACTCGACCTCAGGTGATCCACCCTTGGCCTCCTAAAGTACTCGGATTACAGGTGTGAGCCACCACGCCCG 
GCCCCAGCTAATTTTTGTATTTTTGGTAGACACGGGTTTCAGCATGTTGGCCAGGCTGGTCTTGAACTCCTGACCTC 
AGGTGATCTGCCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCATGCCCAGCCAGAAACCCCAA 
TAACTTTTGCACCAATCTAATATTTTTAGCAGAGACAGGGTTTTGCCATGTTGCCCAGGCTGGTCTCGAACTCCTGA 
CCTCAGGTGATCTGCCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCATGCCCGGCCAGAAACC 
CCAATAACTTGCACCAATCTAATAXTTTTAGCAGAGACAGGGTTTTGCCATGTTGCCCAGGCTAGTCTCAAACTCCT
GACCTCAGGTGATCTGCCTACCTCGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCACCGCGCCCGGTCGAGAA 
TCTCCTTCTTGTTCCTTGAACCCTCTTCCTGTCCCTCAACCTCCTTTCTCCATAACTTCACTTGTTTTCCCTGGAAC 
CCCTGTTCTGTGCGCTCAAATTTGAATTCCCCTTTCCTGGATGTTTTCTTCCTGTCTATGAAACTCCATTCTGTGCT 
CTTGAACTCCAAATCTTGCCTTGAACCATGTCATTTCTATATGACCCTCCAATCCTCAATCTCTGTCTCTGGAATCC 
CCTCAAACCCCACTTTCTGTTCCTTGGACTTTATTCTTCAATTTCCTTCTCCTATGGCCCAGTTCCTAACCCTTGTA 
CCACACATCCTGTCCATTGCATGTGCCGCTTTTCCTCAGTCGCTATTGAATTCCTCCTTCATACTGCTTCAGTTTCC 
TCATCTCCAGCCTGCATTGCGCAGTTCATCCTTCATGTCCACTCACCCACAGGTGCATACCCACCTTCAAGTGATTG 
AGGAGAGGGTGAATCAGAGCCTGGGCCTGCTTGACCAGAACCCCCACCTGGCTCAGGAGCTGCGGCCCCAAATCCGT 
GAGTGTCTATTACCCTGGCTCCCATTACAGATCTCTGAGGGCAGATCTTGACTCCTAAATGTTGGGCCCCCCCAATT 
TCATTTATTCCTCTATAACAAACAGCCCAGACCTTAGCAGTGAAAATCAACAATGATTTTTCTTTGTTCATGATTCT 
GCCATCCGGTCTGCGCTCAGCAGAGTGGTTCTTTCAGTGGTCTTGCCAGTGGTCAAGCATGCAGCTGTATTTAGCTA 
GCAGATCATCTAGGGGCTGGGAGTCTAGCACAAATGGACCTTTCTCTCTCTCCAAGGAAGCGCAAGGCCTCTCTTCT 
CCGTGGAGCTTCTCCATGTGGTCTCATCAGCAGGGTAGCTAGATTCCCTACATGGTGGTTTATGCTCTCTAAGACAT 
CACAGTGGAAGTTGCTAGGTCTTAAGGCTTGGGCCCACATTCTATTTGTTAAAGCAAGTTACAAATTCAGTCCAGAT 
TCAAGGGAAGGAACCTATATGCATACCGGAAAGTGTGACCTATTGCAGCCCCCACATCTATTGTGTCTTTCTCCTGG 
ATATCTCACACATAACCCTGATTCTCCTAGTATTTAAGAAAGCTATCATCTTGAGGCGCGGTGGCTCACGCCTATAA 
TCCCAGCACTTTAGGAGGCCGAGGCGGGTGGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGTGA 
AACCCCGTCTTTACTAAAAATACAAAAATCAGCCGGGCATGATGTCGCTTGCCTGTAATCCCAGCTACTTAGGAGGC 
TGAGGCAAGAGAATTGCTTGAACCCGGGAGGTGGAGGTTGCAGTGAGCTGAGATCGCATCATTGCACTCCAGCTGGG 
CAACAAGAGTGAGACTCTGTCTC 

<210> SEQ ID NO 82 

<211> Length : 1,795 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 82 
>M78076_PEA_1_T28

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA 
GCCCCGCTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCGCTGCTGCTGCCACTATTGCTGCTGCTT 
CTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAGGCCCCGGGGTCGGCCCAGGT 
GGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGAACCAGACCCACAGCGCTCTC 
GACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAGATGTACCCGGAGCTGCAGATTGCACGTGTG 
GAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTCCCGGAGCGGCAGCTGCGCCCACCCCCACCA 
CCAGGTTGTGCCCTTCCGCTGCCTGCCTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCT

TGCACCAGGAGCGCATGGACCAATGTGAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAGGCCTGCAGCTCCCAG 

GGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGTGGAGTATGTGTGCTG 

TCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGGTGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCA 

GAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCCCACAGCCAGTAGATGATTACTTCGTGGAGCCTCCG 

CAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCCCATACACTTGCAGTGGTCGGCAAAGTCACTCCCAC 

CCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGGTTCCTGAGGG 

CCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAGGTGATGCGTGAATGGGCCATGGCAGACAACCAG 

TCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAGCACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCA 

GGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCACCCGCGTCATCGCCCTTATCAACGACCAGCGCCGGG 

CTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGCCTCAGGCGGAGCGTGTCCTGTTGGCCCTGCGGCGC 

TACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTACCAGCATGTGGCCGCCGTGGATCCCGAGAA 

GGCACAGCAGATGCGCTTCCAGCCCCAGAACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTT 

CACACCCTTTTGTGAGACGGCTGGAAATTCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCTAAGAATTCCCAG 

ATAGTCCCAGCAGCCTCCCCACGTGGCACCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGT 

GACCGCCACCTTGGTCCTAGTGTCTATTCCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTC 
CTCTTCCCTACCAGGCCAGTCTGA 



<210> SEQ ID NO 83 

<211> Length : 2,175 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 83 
>T99080_PEA_4_T0

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC 
GTAAGCATACTCAGCTATTTACCATATAAGATAACTCTTAAGAACTGGAGATAGTCAGCTCCCCTGGGTTAATTTGA 
AGCAGAAGAGGGCAGTTGTTATACTGCCCTGTCAGTTGGATGCGGAGTCTTACTCAAAATTCATTCTCAGCATTCTT 
CTTTTATGGTATCTTCTTTGGCACTTAGCAGCGCATCAGGTAGGCATCTTCTATTTTTCTTCATTCCTTAATTTCCT 
TTGTATCCCTCAAATGGTTATTTATTTGGCTGGAGTCTGTTTTGTTCATTAAGCAAACATGTCTTTGCTTTGAACAT 
GTCTTTGATTTGATGGATACTTAAATTCCTCATCAAACATTTTGTTGCTATGCATAACGTTTTCTTTGGCCAACTCC 
AGCAATTTCCCACATTTTGACATGCAATCATGTTAACTCCCATTTTCTTTTGTAATCCAACATCTTCTATTTAGATA
ATTACTTTAACAATCAATGACTTAATATTCTAATCATAAATTTATACAAAAATAAAATTACCTCCAAAACATTGCTA 
CCTTTCCTAAACATTCAGTCTTGCCACAGTTTAATAAAAGGAAAGAACATTAAAAAGGATAAGACACTGTAATGATT 
AGATGCTTTTTATAAGCCTAAAGGCATTGTGATTATTTAGACAGAAGAGAAGAAAGTGAAGTGAAAACCTGATAGTT 
ATGTAGTCTCATGGTTTGCTGTTGAGAGGCTGAACACCAGCTGCTTTCCTTTTCTAGGAAGATAATAAAGTGGGCTT 
TGGCTACAACATAAAGATGTTGGGTTAGACAGTTTCACTACAGTAAGAACAACGGGATGAGTTGCCCAGGAAATTGT 
GAAATACTTTCTAATGATCTTTAAAGATATAATGAACACTAATTCATCTGGATTTGTTTAGGTGTGGTCCTGGTTAA 
AGGCAAAGGGAAGGATCAGATAACTTCATGTTTTTTCCATTTAACATACCCAATAGATTCTTGATTAGGGGAAGGGA 
AAATGAGCAAGATACAGTCCAGTATTCTAAAAACAATCAGCCTTAGGGGATCATTTCAAAAGCATCTGTTTTGGACT 
TAAGTCTTTGATACTTAACCAAATTGACTACACAGTGAAAAATTCTAGTGCCTGGGTTTTATAGGGTAGAAGAAAGA 
CATGCAGTCAAGTGGCCAATACTTCATGTGAAGATAAGCAATGAGATCCTTCTTGCTGTCTTTCTTTTGACTGTTCT 
GGGCAATATCAAATTAGTTTCAGTGGCTTGATTCTAGGCCAAGATTCTGGCAACAGATTGTAGTCTTACCTTGTTTT 
CTTCAATCTCACTGGATCTCTCTCTCTTTTTACCCCCCTTAGGCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGT 
CCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGCAGGAATGGC 
TTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTGAAGTTGGAT 
TACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATT 
ATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATTTTTGAATGT 
TATTATATATTACCTGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAATAAAAACGTTAGAACCTTCTGC 
TTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 84 

<211> Length : 1,956 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 84 
>T99080_PEA_4_T2

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAG 
GGGTGTTTTTCCGTAAGCATACTCAGCTATTTACCATATAAGATAACTCTTAAGAACTGGAGATAGTCAGCTCCCCT 
GGGTTAATTTGAAGCAGAAGAGGGCAGTTGTTATACTGCCCTGTCAGTTGGATGCGGAGTCTTACTCAAAATTCATT 
CTCAGCATTCTTCTTTTATGGTATCTTCTTTGGCACTTAGCAGCGCATCAGGTAGGCATCTTCTATTTTTCTTCATT 
CCTTAATTTCCTTTGTATCCCTCAAATGGTTATTTATTTGGCTGGAGTCTGTTTTGTTCATTAAGCAAACATGTCTT 
TGCTTTGAACATGTCTTTGATTTGATGGATACTTAAATTCCTCATCAAACATTTTGTTGCTATGCATAACGTTTTCT 
TTGGCCAACTCCAGCAATTTCCCACATTTTGACATGCAATCATGTTAACTCCCATTTTCTTTTGTAATCCAACATCT 
TCTATTTAGATAATTACTTTAACAATCAATGACTTAATATTCTAATCATAAATTTATACAAAAATAAAATTACCTCC 
AAAACATTGCTACCTTTCCTAAACATTCAGTCTTGCCACAGTTTAATAAAAGGAAAGAACATTAAAAAGGATAAGAC
ACTGTAATGATTAGATGCTTTTTATAAGCCTAAAGGCATTGTGATTATTTAGACAGAAGAGAAGAAAGTGAAGTGAA 
AACCTGATAGTTATGTAGTCTCATGGTTTGCTGTTGAGAGGCTGAACACCAGCTGCTTTCCTTTTCTAGGAAGATAA 
TAAAGTGGGCTTTGGCTACAACATAAAGATGTTGGGTTAGACAGTTTCACTACAGTAAGAACAACGGGATGAGTTGC 
CCAGGAAATTGTGAAATACTTTCTAATGATCTTTAAAGATATAATGAACACTAATTCATCTGGATTTGTTTAGGTGT 
GGTCCTGGTTAAAGGCAAAGGGAAGGATCAGATAACTTCATGTTTTTTCCATTTAACATACCCAATAGATTCTTGAT 
TAGGGGAAGGGAAAATGAGCAAGATACAGTCCAGTATTCTAAAAACAATCAGCCTTAGGGGATCATTTCAAAAGCAT 
CTGTTTTGGACTTAAGTCTTTGATACTTAACCAAATTGACTACACAGTGAAAAATTCTAGTGCCTGGGTTTTATAGG 
GTAGAAGAAAGACATGCAGTCAAGTGGCCAATACTTCATGTGAAGATAAGCAATGAGATCCTTCTTGCTGTCTTTCT 
TTTGACTGTTCTGGGCAATATCAAATTAGTTTCAGTGGCTTGATTCTAGGCCAAGATTCTGGCAACAGATTGTAGTC 
TTACCTTGTTTTCTTCAATCTCACTGGATCTCTCTCTCTTTTTACCCCCCTTAGGCTGAGGGTAAAAAGCTGGGATT 
GGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATA 
TGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAAAAAGTCATC 
TTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGT 
TTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTTTAATGTCAG 
ATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAATAAAAACGTT 
AGAACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 85 

<211> Length : 1,804 

<212> Type : DNA 

<213> ORGANISM : Homo sapiens 

<400> sequence : 85 
>T99080_PEA_4_T4

TTTTGAGACAGAGTCTTGCTCTGTTGCCTAGGCTAAAGTGCAGTAGTGCGATCTCGGCTCACTGCAACCTCCACTTC 
CTGGGTTAATTTGAAGCAGAAGAGGGCAGTTGTTATACTGCCCTGTCAGTTGGATGCGGAGTCTTACTCAAAATTCA 
TTCTCAGCATTCTTCTTTTATGGTATCTTCTTTGGCACTTAGCAGCGCATCAGGTAGGCATCTTCTATTTTTCTTCA 
TTCCTTAATTTCCTTTGTATCCCTCAAATGGTTATTTATTTGGCTGGAGTCTGTTTTGTTCATTAAGCAAACATGTC 
TTTGCTTTGAACATGTCTTTGATTTGATGGATACTTAAATTCCTCATCAAACATTTTGTTGCTATGCATAACGTTTT 
CTTTGGCCAACTCCAGCAATTTCCCACATTTTGACATGCAATCATGTTAACTCCCATTTTCTTTTGTAATCCAACAT 
CTTCTATTTAGATAATTACTTTAACAATCAATGACTTAATATTCTAATCATAAATTTATACAAAAATAAAATTACCT 
CCAAAACATTGCTACCTTTCCTAAACATTCAGTCTTGCCACAGTTTAATAAAAGGAAAGAACATTAAAAAGGATAAG 
ACACTGTAATGATTAGATGCTTTTTATAAGCCTAAAGGCATTGTGATTATTTAGACAGAAGAGAAGAAAGTGAAGTG 
AAAACCTGATAGTTATGTAGTCTCATGGTTTGCTGTTGAGAGGCTGAACACCAGCTGCTTTCCTTTTCTAGGAAGAT 
AATAAAGTGGGCTTTGGCTACAACATAAAGATGTTGGGTTAGACAGTTTCACTACAGTAAGAACAACGGGATGAGTT 
GCCCAGGAAATTGTGAAATACTTTCTAATGATCTTTAAAGATATAATGAACACTAATTCATCTGGATTTGTTTAGGT
GTGGTCCTGGTTAAAGGCAAAGGGAAGGATCAGATAACTTCATGTTTTTTCCATTTAACATACCCAATAGATTCTTG 
ATTAGGGGAAGGGAAAATGAGCAAGATACAGTCCAGTATTCTAAAAACAATCAGCCTTAGGGGATCATTTCAAAAGC 
ATCTGTTTTGGACTTAAGTCTTTGATACTTAACCAAATTGACTACACAGTGAAAAATTCTAGTGCCTGGGTTTTATA 
GGGTAGAAGAAAGACATGCAGTCAAGTGGCCAATACTTCATGTGAAGATAAGCAATGAGATCCTTCTTGCTGTCTTT 
CTTTTGACTGTTCTGGGCAATATCAAATTAGTTTCAGTGGCTTGATTCTAGGCCAAGATTCTGGCAACAGATTGTAG 
TCTTACCTTGTTTTCTTCAATCTCACTGGATCTCTCTCTCTTTTTACCCCCCTTAGGCTGAGGGTAAAAAGCTGGGA 
TTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCA 
TATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAAAAAGTCA 
TCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTG 
GTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTTTAATGTC 
AGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAATAAAAACG 
TTAGAACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 86 

<211> Length : 838 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 86 
>T99080_PEA_4_T6

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC 
GTAAGCATACTCAGGCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAA 
GGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACA 
CATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAAT 
GGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTT 
AATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGG 
ATTACCACTGTACACAAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 87 
<211> Length : 606 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 87 
>T99080_PEA_4_T9

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTTCCAGTTAGCTTTGCT 
ACCAGTTATGCAGGCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAG 
GACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACAC 
ATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATG 
GCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTA 
ATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGA 
TTACCACTGTACACAAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 88 

<211> Length : 698 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 88 
>T99080_PEA_4_T10

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAG 
GGGTGTTTTTCCGTAAGCATACTCAGGAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTT 
CCAGTTAGCTTTGCTACCAGTTATGCAGGCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACC 
GGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGA 
AGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCA 
AATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAG 
AACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACC 
TGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTA 
AAAAA 

<210> SEQ ID NO 89 
<211> Length : 733 




<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 89 
>T99080_PEA_4_T11

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAG 
GGGTGTTTTTCCGTAAGCATACTCAGCTTTGTTTTTGGGAGGCTGAGGAAGGAGGATCATTTGAGCCTGGGAGGTTA 
AGGCTGCAATAAGCTGTGACTGTGCCACCATCCTTCAGAAAAAAAAAAGAAAAAGGAAAAGAGGCTGAGGGTAAAAA 
GCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGG 
TGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAA 
AAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAA 
CTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTT 
TAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAAT 
AAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA

<210> SEQ ID NO 90 

<211> Length : 746 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 90 
>T99080_PEA_4_T13

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
CTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGT 
CCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAA 
CTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGT 
TTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGT 
CAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACAC 
AAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA 
<210> SEQ ID NO 91

<211> Length : 782 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 91 
>T99080_PEA_4_T14

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAG 
GGGTGTTTTTCCGTAAGCATACTCAGGAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTT 
CCAGTTAGCTTTGCTACCAGTTATGCAGGTATTAATGAATTTAAAAGGATTTAAATCAAGGAATGTTCTCCAACTAC 
AGTGGAACTAAACCACATTAAAAAAATAAAAAGGATAACTGGAAAATCCCAAAATATTTGGAAACCATATAGCACAC 
TTACTTCTAAAATTGTGGTAGAATACATATAACATAGAAATTATTGTTCTAACCATTTTTAAATGTACAATTCAGTG 
GTCTTAAGCACATTCACATTGTTCTGTTTATCTACAGAACGCTTTTCATCTTGCAAAACTGAAACTCTGTATTCATT 
AAACACTAACTCCCCATTTTCTCCTTCCCCCATGCCCCTGACAATCATAAATCTACATTCTATTAATTCAACTGCTC 
TAGTTACCTCATATAAGTGGAATTTTACAGTATTTGTCCTTTTGTGGCTGGCTTATTTCACTTAGCATAATATCCTC 
AGGGTTCATCTGTGTTATATCATGAAAGTAAAAACAATTTCCTTTCTTTGTAAGACGGAATAATATCCTGTTGTATG
TGTATACTTTCA

<210> SEQ ID NO 92 

<211> Length : 627 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 92 
>T99080_PEA_4_T17

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGGGTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAG 
GGGTGTTTTTCCGTAAGCATACTCAGGTATGTGGCCTGCAGGCTTTGGGGTGGTAACCTAGGTTGTGGGCATAGGAA 
CAAGGGACGTGTTCTTGACAAATGTGAACTAGAAGCCGCTGGCTATTTGGTAGCTGCATGGCAAAGAGGTAGTTTGT 
AGAGCAATGAGTATGAAAATGCTGTGCAATCGGTAAAACATGGTGTAAATGGAAGATCACCCTGCTGTTATTATTAG 
TCAGTGGTGCCTGAGCATCTAGATACACATATTAATGAGCTTCTCTCTTCCAAGGGAAATAGAGGGCTTCCCCAGTG 
CTCGCCGTTGTGGCATCATCAAACCAAGTCAGGTTTCTTATAGAAAGGCTAACATTGATTGAAGAGCTGGCATTTAG 
ATGACCTGATGTCATGTATATAACATATATAAATGCTTCCTCTGCAGCTGCTGCACTTTCTTCAGACCTCTTTCTCC
AAGCTTCCCTC 

<210> SEQ ID NO 93 

<211> Length : 917 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 93 
>T99080_PEA_4_T18

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC 
GTAAGCATACTCAGGAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTTCCAGTTAGCTTT 
GCTACCAGTTATGCAGGCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGC 
AAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCA 
CACATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATA 
ATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTG 
TTAATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAA 
GGATTACCACTGTACACAAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 94 

<211> Length : 952 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 94 
>T99080_PEA_4_T19

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC
GTAAGCATACTCAGCTTTGTTTTTGGGAGGCTGAGGAAGGAGGATCATTTGAGCCTGGGAGGTTAAGGCTGCAATAA 
GCTGTGACTGTGCCACCATCCTTCAGAAAAAAAAAAGAAAAAGGAAAAGAGGCTGAGGGTAAAAAGCTGGGATTGGT 
AGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGGTCCCATCTCCAAGGTGCGTCATATGC 
AGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAAACTTCAACAATGAAAAAGTCATCTTG 
AAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAGTTTTCTAAGATAAACTCAGTGGTTTG 
GTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAGTCAATAAGTTATTTTAATGTCAGATT 
TTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACACAAATCTAATCAATAAAAACGTTAGA 
ACCTTCTGCTTAGAGTACATTTAAAAAA 

<210> SEQ ID NO 95 

<211> Length : 1,001

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 95 
>T99080_PEA_4_T20

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC 
GTAAGCATACTCAGGAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTTCCAGTTAGCTTT 
GCTACCAGTTATGCAGGTATTAATGAATTTAAAAGGATTTAAATCAAGGAATGTTCTCCAACTACAGTGGAACTAAA 
CCACATTAAAAAAATAAAAAGGATAACTGGAAAATCCCAAAATATTTGGAAACCATATAGCACACTTACTTCTAAAA 
TTGTGGTAGAATACATATAACATAGAAATTATTGTTCTAACCATTTTTAAATGTACAATTCAGTGGTCTTAAGCACA 
TTCACATTGTTCTGTTTATCTACAGAACGCTTTTCATCTTGCAAAACTGAAACTCTGTATTCATTAAACACTAACTC 
CCCATTTTCTCCTTCCCCCATGCCCCTGACAATCATAAATCTACATTCTATTAATTCAACTGCTCTAGTTACCTCAT 
ATAAGTGGAATTTTACAGTATTTGTCCTTTTGTGGCTGGCTTATTTCACTTAGCATAATATCCTCAGGGTTCATCTG 
TGTTATATCATGAAAGTAAAAACAATTTCCTTTCTTTGTAAGACGGAATAATATCCTGTTGTATGTGTATACTTTCA 

<210> SEQ ID NO 96 
<211> Length : 846 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 96 
>T99080_PEA_4_T21

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAGG 
TTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTCC 
GTAAGCATACTCAGGTATGTGGCCTGCAGGCTTTGGGGTGGTAACCTAGGTTGTGGGCATAGGAACAAGGGACGTGT 
TCTTGACAAATGTGAACTAGAAGCCGCTGGCTATTTGGTAGCTGCATGGCAAAGAGGTAGTTTGTAGAGCAATGAGT 
ATGAAAATGCTGTGCAATCGGTAAAACATGGTGTAAATGGAAGATCACCCTGCTGTTATTATTAGTCAGTGGTGCCT 
GAGCATCTAGATACACATATTAATGAGCTTCTCTCTTCCAAGGGAAATAGAGGGCTTCCCCAGTGCTCGCCGTTGTG 
GCATCATCAAACCAAGTCAGGTTTCTTATAGAAAGGCTAACATTGATTGAAGAGCTGGCATTTAGATGACCTGATGT 
CATGTATATAACATATATAAATGCTTCCTCTGCAGCTGCTGCACTTTCTTCAGACCTCTTTCTCCAAGCTTCCCTC 

<210> SEQ ID NO 97 

<211> Length : 4,539 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 97 
>T08446_PEA_1_T2

GCTACGGAAGGGTTTTCAGCAGGGAGAGATGGGACCAGCATGCTCTCCATCCTGTTGCCCCGCCTCACGTCTGGCCC 
CTGCTTCTCCAGTCCCCCACCCCAGACCACACACGAAGAAGCAGTCCTGTCCTCAGCCCAGCCCTCACCTCCCCCGA 
CCTGCCATCCTGCTTCATGCTCAGGGCGGTGTGTGGAGCGCCCGGGGCTCTGGACCCGCGCTGCCAGATAACAATGC 
TCTCGTTGTCTCTTTGCTCCCATCTCTGGGGGCCTCTGATTCTTTCTGCTCTACAGGCACGCAGCACTGACAGCCTG 
GATGGCCCAGGGGAGGGCTCGGTGCAGCCTCTACCCACTGCTGGGGGGCCCAGTGTGAAGGGGAAGCCTGGGAAGAG 
GCTCTCAGCTCCTCGAGGCCCCTTCCCGCGGCTGGCTGACTGCGCCCATTTCCACTACGAGAACGTTGACTTTGGCC 
ACATTCAGCTCCTGCTGTCTCCAGACCGTGAAGGGCCCAGCCTCTCTGGAGAGAATGAGCTGGTGTTCGGGGTGCAG 
GTGACCTGTCAGGGCCGTTCCTGGCCGGTTCTCCGGAGTTACGATGACTTTCGTTCCCTGGATGCCCACCTCCACCG 
GTGCATATTTGACCGGAGGTTCTCCTGCCTTCCGGAGCTTCCCCCGCCCCCCGAGGGTGCCAGGGCTGCCCAGATGC 
TGGTGCCACTGCTGCTGCAGTACCTGGAGACACTGTCAGGACTGGTGGACAGTAACCTCAACTGCGGGCCTGTGCTC 
ACCTGGATGGAGCTGGACAATCACGGCCGGCGACTGCTCCTCAGTGAGGAGGCGTCACTCAATATCCCTGCAGTGGC 
GGCCGCCCATGTGATCAAACGGTATACAGCCCAGGCGCCAGATGAGCTGTCCTTTGAGGTGGGAGACATTGTCTCGG
TGATCGACATGCCACCCACAGAGGATCGGAGCTGGTGGCGGGGCAAGCGAGGCTTCCAGGTCGGGTTCTTCCCCAGT 
GAGTGTGTGGAACTCTTCACAGAGCGGCCAGGTCCGGGCCTGAAGGCGGATGCCGATGGCCCCCCATGTGGCATCCC 
GGCTCCCCAGGGTATCTCGTCTCTGACCTCAGCTGTGCCACGGCCTCGTGGGAAGCTGGCCGGCCTGCTCCGCACCT 
TCATGCGCTCCCGCCCTTCTCGGCAGCGGCTGCGGCAGCGGGGAATCCTGCGACAGAGGGTGTTTGGCTGCGATCTT 
GGCGAGCACCTCAGCAACTCAGGCCAGGATGTGCCCCAGGTGCTGCGCTGCTGCTCCGAGTTCATTGAGGCCCACGG 
GGTGGTGGATGGGATCTACCGGCTCTCAGGCGTGTCTTCCAACATCCAGAGGCTTCGGCACGAGTTTGACAGTGAGA 
GGATCCCGGAGCTGTCTGGCCCTGCATTCCTGCAGGACATCCACAGCGTGTCCTCCCTCTGCAAGCTCTACTTCCGA 
GAGCTTCCGAACCCTCTGCTCACCTACCAGCTCTATGGGAAGTTCAGTGAGGCCATGTCAGTGCCTGGGGAGGAGGA 
GCGTCTGGTGCGGGTGCACGATGTCATCCAGCAGCTGCCCCCACCACATTACAGGACCCTGGAGTACCTGCTGAGGC 
ACCTGGCCCGCATGGCGAGACACAGTGCCAACACCAGCATGCATGCCCGCAACCTGGCCATTGTCTGGGCACCCAAC 
CTGCTACGGTCCATGGAGCTGGAGTCAGTGGGAATGGGTGGCGCGGCGGCGTTCCGGGAAGTTCGGGTGCAGTCGGT 
GGTGGTGGAGTTTCTGCTCACCCATGTGGACGTCCTGTTCAGCGACACCTTCACCTCCGCCGGCCTCGACCCTGCAG 
GCCGCTGCCTGCTCCCCAGGCCCAAGTCCCTTGCGGGCAGCTGCCCCTCCACCCGCCTGCTGACGCTGGAGGAAGCC 
CAGGCACGCACCCAGGGCCGGCTGGGGACGCCCACGGAGCCCACAACTCCCAAGGCCCCGGCCTCACCTGCGGAAAG 
GAGGAAAGGGGAGAGAGGGGAGAAGCAGCGGAAGCCAGGGGGCAGCAGCTGGAAGACGTTCTTTGCACTGGGCCGGG 
GCCCCAGTGTCCCTCGAAAGAAGCCCCTGCCCTGGCTGGGGGGCACCCGTGCCCCACCGCAGCCTTCAGGCAGCAGA 
CCCGACACCGTCACACTGAGATCTGCCAAGAGCGAGGAGTCTCTGTCATCGCAGGCCAGCGGGGCTGGCCTCCAGAG 
GCTGCACAGGCTGCGGCGACCCCACTCCAGCAGCGACGCTTTCCCTGTGGGCCCAGCACCTGCTGGCTCCTGCGAGA 
GCCTGTCCTCGTCCTCCTCCTCCGAGTCCTCCTCCTCTGAGTCCTCCTCTTCCTCCTCTGAGTCCTCAGCAGCTGGG 
CTGGGGGCACTCTCTGGGTCTCCCTCACACCGTACCTCAGCCTGGCTAGATGATGGTGATGAGCTGGACTTCAGCCC 
ACCCCGCTGCCTGGAGGGACTCCGGGGGCTGGACTTTGATCCCTTAACCTTCCGCTGCAGCAGCCCCACCCCAGGGG 
ATCCCGCACCTCCCGCCAGCCCAGCACCCCCCGCCCCTGCCTCTGCCTTCCCACCCAGGGTGACCCCCCAGGCCATC 
TCGCCCCGGGGGCCCACCAGCCCCGCCTCGCCTGCTGCCCTAGACATCTCAGAGCCCCTGGCTGTATCAGTGCCACC 
CGCTGTCCTAGAACTGCTGGGGGCTGGGGGAGCACCTGCCTCAGCCACCCCAACACCAGCTCTCAGCCCCGGCCGGA 
GCCTGCGCCCCCATCTCATACCCCTGCTGCTGCGAGGAGCCGAGGCCCCGCTGACTGACGCCTGCCAGCAGGAGATG 
TGCAGCAAGCTCCGGGGAGCCCAGGGCCCACTCGGTCCTGATATGGAGTCACCACTGCCACCCCCTCCCCTGTCTCT 
CCTGCGCCCTGGGGGTGCCCCACCCCCGCCCCCTAAGAACCCAGCACGCCTCATGGCCCTGGCCCTGGCTGAGCGGG 
CTCAGCAGGTGGCCGAGCAACAGAGCCAGCAGGAGTGTGGGGGCACCCCACCTGCTTCCCAATCCCCCTTCCACCGC 
TCGCTGTCTCTGGAGGTGGGCGGGGAGCCCCTGGGGACCTCAGGGAGTGGGCCACCTCCCAACTCCCTAGCACACCC 
GGGTGCCTGGGTCCCGGGACCCCCACCCTACTTACCAAGGCAACAAAGTGATGGGAGCCTGCTGAGGAGCCAGCGGC 
CCATGGGGACCTCAAGGAGGGGACTCCGAGGCCCTGCCCAGGTCAGTGCCCAGCTCAGGGCAGGTGGCGGGGGCAGG 
GATGCGCCAGAGGCAGCAGCCCAGTCCCCATGTTCTGTCCCCTCACAGGTTCCTACCCCCGGCTTCTTCTCCCCAGC 
CCCCAGGGAGTGCCTGCCACCCTTCCTCGGGGTCCCCAAGCCAGGCTTGTACCCCCTGGGCCCCCCATCCTTCCAGC 
CCAGTTCCCCAGCCCCAGTCTGGAGGAGCTCTCTGGGCCCCCCTGCACCACTCGACAGGGGAGAGAACCTGTACTAT 
GAGATCGGGGCAAGTGAGGGGTCCCCCTATTCTGGCCCCACCCGCTCCTGGAGTCCCTTTCGCTCCATGCCCCCCGA 
CAGGCTCAATGCCTCCTACGGCATGCTTGGCCAATCACCCCCACTCCACAGGTCCCCCGACTTCCTGCTCAGCTACC 
CGCCAGCCCCCTCCTGCTTTCCCCCTGACCACCTTGGCTACTCAGCCCCCCAGCACCCTGCTCGGCGCCCTACACCG 
CCTGAGCCCCTCTACGTCAACCTAGCTCTAGGGCCCAGGGGTCCCTCACCTGCCTCTTCCTCCTCCTCTTCCCCTCC
TGCCCACCCCCGAAGCCGTTCAGATCCCGGTCCCCCAGTCCCCCGCCTTCCCCAGAAACAACGGGCACCCTGGGGAC 
CCCGTACCCCTCATAGGGTGCCGGGTCCCTGGGGCCCTCCTGAGCCTCTCCTGCTCTACAGGGCAGCCCCGCCAGCC 
TACGGAAGGGGGGGCGAGCTCCACCGAGGGTCCTTGTACAGAAATGGAGGGCAAAGAGGGGAGGGGGCTGGTCCCCC 
ACCCCCTTACCCCACTCCCAGCTGGTCCCTCCACTCTGAGGGCCAGACCCGAAGCTACTGCTGAGCACCAGCTGGGA 
GGGGCCGTCCTTCCTTCCCTTCACCCTCACTGGATCTTGGCCCAACCAAATCCCTTGTTTTGTATTTTCTTGAACCC 
CGACCACTACCCCAGGTTTCTAACTTTGTAACTTGCTTCTGATGTGGGTCCCTAACCTATAATCTCAGCTTCCCTAC 
CCTGGACTGAAGGGTCTGCCCATCCCCCCACCACCCTCCATCCTGGGGGCCCTCGCACAAATCTGGGGTGGGAGGGG 
CTAGGCTGACCCCATCCTCCTCTCCCTCCAGGAGCCCCCAGCATGTCCTGACCTGTGCACGGGGATGGGGGGACAAC 
TCCTACCCTTCTTTCCCCACATGCCCCACTAAACCATCTGACAACATTAATGAATAAAATGGTGAAAATGTGA 

<210> SEQ ID NO 98 

<211> Length : 968 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 98 
>T08446_PEA_1_T22

GCTACGGAAGGGTTTTCAGCAGGGAGAGATGGGACCAGCATGCTCTCCATCCTGTTGCCCCGCCTCACGTCTGGCCC 
CTGCTTCTCCAGTCCCCCACCCCAGACCACACACGAAGAAGCAGTCCTGTCCTCAGCCCAGCCCTCACCTCCCCCGA 
CCTGCCATCCTGCTTCATGCTCAGGGCGGTGTGTGGAGCGCCCGGGGCTCTGGACCCGCGCTGCCAGATAACAATGC 
TCTCGTTGTCTCTTTGCTCCCATCTCTGGGGGCCTCTGATTCTTTCTGCTCTACAGGCACGCAGCACTGACAGCCTG 
GATGGCCCAGGGGAGGGCTCGGTGCAGCCTCTACCCACTGCTGGGGGGCCCAGTGTGAAGGGGAAGCCTGGGAAGAG 
GCTCTCAGCTCCTCGAGGCCCCTTCCCGCGGCTGGCTGACTGCGCCCATTTCCACTACGAGAACGTTGACTTTGGCC 
ACATTCAGCTCCTGCTGTCTCCAGACCGTGAAGGGCCCAGCCTCTCTGGAGAGAATGAGCTGGTGTTCGGGGTGCAG 
GTGACCTGTCAGGGCCGTTCCTGGCCGGTTCTCCGGAGTTACGATGACTTTCGTTCCCTGGATGCCCACCTCCACCG 
GTGCATATTTGACCGGAGGTTCTCCTGCCTTCCGGAGCTTCCCCCGCCCCCCGAGGGTGCCAGGGCTGCCCAGATGC 
TGGTGCCACTGCTGCTGCAGTACCTGGAGACACTGTCAGGACTGGTGGACAGTAACCTCAACTGCGGGCCTGTGCTC 
ACCTGGATGGAGGTGGGCCTGGGCAGGGGGCTTGGAGATTCCGAGTGGGTGAGGGGGTGCGTGTGCCACCACGCCCA 
GCACAGAGAGATTTTAGATGGCAACCGTGTGGCATCTGCTGTGGAGGATGAAGGTGCAGAGGTGGATGGGGAAGCCT 
TCAGGTGGGGAAGCCTTTGGGTGGGAGAGTCCTGGGACATGTGA 



<210> SEQ ID NO 99 
<211> Length : 3,615 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 99 
>HUMCA1XIA_T16 

ACACAGTACTCTCAGCTTGTTGGTGGAAGCCCCTCATCTGCCTTCATTCTGAAGGCAGGGCCCGGCAGAGGAAGGAT 
CAGAGGGTCGCGGCCGGAGGGTCCCGGCCGGTGGGGCCAACTCAGAGGGAGAGGAAAGGGCTAGAGACACGAAGAAC 
GCAAACCATCAAATTTAGAAGAAAAAGCCCTTTGACTTTTTCCCCCTCTCCCTCCCCAATGGCTGTGTAGCAAACAT 
CCCTGGCGATACCTTGGAAAGGACGAAGTTGGTCTGCAGTCGCAATTTCGTGGGTTGAGTTCACAGTTGTGAGTGCG 
GGGCTCGGAGATGGAGCCGTGGTCCTCTAGGTGGAAAACGAAACGGTGGCTCTGGGATTTCACCGTAACAACCCTCG 
CATTGACCTTCCTCTTCCAAGCTAGAGAGGTCAGAGGAGCTGCTCCAGTTGATGTACTAAAAGCACTAGATTTTCAC 
AATTCTCCAGAGGGAATATCAAAAACAACGGGATTTTGCACAAACAGAAAGAATTCTAAAGGCTCAGATACTGCTTA 
CAGAGTTTCAAAGCAAGCACAACTCAGTGCCCCAACAAAACAGTTATTTCCAGGTGGAACTTTCCCAGAAGACTTTT 
CAATACTATTTACAGTAAAACCAAAAAAAGGAATTCAGTCTTTCCTTTTATCTATATATAATGAGCATGGTATTCAG 
CAAATTGGTGTTGAGGTTGGGAGATCACCTGTTTTTCTGTTTGAAGACCACACTGGAAAACCTGCCCCAGAAGACTA 
TCCCCTCTTCAGAACTGTTAACATCGCTGACGGGAAGTGGCATCGGGTAGCAATCAGCGTGGAGAAGAAAACTGTGA 
CAATGATTGTTGATTGTAAGAAGAAAACCACGAAACCACTTGATAGAAGTGAGAGAGCAATTGTTGATACCAATGGA 
ATCACGGTTTTTGGAACAAGGATTTTGGATGAAGAAGTTTTTGAGGGGGACATTCAGCAGTTTTTGATCACAGGTGA 
TCCCAAGGCAGCATATGACTACTGTGAGCATTATAGTCCAGACTGTGACTCTTCAGCACCCAAGGCTGCTCAAGCTC 
AGGAACCTCAGATAGATGAGTATGCACCAGAGGATATAATCGAATATGACTATGAGTATGGGGAAGCAGAGTATAAA 
GAGGCTGAAAGTGTAACAGAGGGACCCACTGTAACTGAGGAGACAATAGCACAGACGGAGGCAAACATCGTTGATGA 
TTTTCAAGAATACAACTATGGAACAATGGAAAGTTACCAGACAGAAGCTCCTAGGCATGTTTCTGGGACAAATGAGC 
CAAATCCAGTTGAAGAAATATTTACTGAAGAATATCTAACGGGAGAGGATTATGATTCCCAGAGGAAAAATTCTGAG 
GATACACTATATGAAAACAAAGAAATAGACGGCAGGGATTCTGATCTTCTGGTAGATGGAGATTTAGGCGAATATGA 
TTTTTATGAATATAAAGAATATGAAGATAAACCAACAAGCCCCCCTAATGAAGAATTTGGTCCAGGTGTACCAGCAG 
AAACTGATATTACAGAAACAAGCATAAATGGCCATGGTGCATATGGAGAGAAAGGACAGAAAGGAGAACCAGCAGTG 
GTTGAGCCTGGTATGCTTGTCGAAGGACCACCAGGACCAGCAGGACCTGCAGGTATTATGGGTCCTCCAGGTCTACA 
AGGCCCCACTGGACCCCCTGGTGACCCTGGCGATAGGGGCCCCCCAGGACGTCCTGGCTTACCAGGGGCTGATGGTC 
TACCTGGTCCTCCTGGTACTATGTTGATGTTACCGTTCCGTTATGGTGGTGATGGTTCCAAAGGACCAACCATCTCT 
GCTCAGGAAGCTCAGGCTCAAGCTATTCTTCAGCAGGCTCGGATTGCTCTGAGAGGCCCACCTGGCCCAATGGGTCT 
AACTGGAAGACCAGGTCCTGTGGGGGGGCCTGGTTCATCTGGGGCCAAAGGTGAGAGTGGTGATCCAGGTCCTCAGG 
GCCCTCGAGGCGTCCAGGGTCCCCCTGGTCCAACGGGAAAACCTGGAAAAAGGGGTCGTCCAGGTGCAGATGGAGGA 
AGAGGAATGCCAGGAGAACCTGGGGCAAAGGGAGATCGAGGGTTTGATGGACTTCCGGGTCTGCCAGGTGACAAAGG 
TCACAGGGGTGAACGAGGTCCTCAAGGTCCTCCAGGTCCTCCTGGTGATGATGGAATGAGGGGAGAAGATGGAGAAA 
TTGGACCAAGAGGTCTTCCAGGTGAAGCTGGCCCACGAGGTTTGCTGGGTCCAAGGGGAACTCCAGGAGCTCCAGGG 
CAGCCTGGTATGGCAGGTGTAGATGGCCCCCCAGGACCAAAAGGGAACATGGGTCCCCAAGGGGAGCCTGGGCCTCC 
AGGTCAACAAGGGAATCCAGGACCTCAGGGTCTTCCTGGTCCACAAGGTCCAATTGGTCCTCCTGGTGAAAAAGGAC 
CACAAGGAAAACCAGGACTTGCTGGACTTCCTGGTGCTGATGGGCCTCCTGGTCATCCTGGGAAAGAAGGCCAGTCT
GGAGAAAAGGGGGCTCTGGGTCCCCCTGGTCCACAAGGTCCTATTGGATACCCGGGCCCCCGGGGAGTAAAGGGAGC 
AGATGGTGTCAGAGGTCTCAAGGGATCTAAAGGTGAAAAGGGTGAAGATGGTTTTCCAGGATTCAAAGGTGACATGG 
GTCTAAAAGGTGACAGAGGAGAAGTTGGTCAAATTGGCCCAAGAGGGGAAGATGGCCCTGAAGGACCCAAAGGTCGA 
GCAGGCCCAACTGGAGACCCAGGTCCTTCAGGTCAAGCAGGAGAAAAGGGAAAACTTGGAGTTCCAGGATTACCAGG 
ATATCCAGGAAGACAAGGTCCAAAGGGTTCCACTGGATTCCCTGGGTTTCCAGGTGCCAATGGAGAGAAAGGTGCAC 
GGGGAGTAGCTGGCAAACCAGGCCCTCGGGGTCAGCGTGGTCCAACGGGTCCTCGAGGTTCAAGAGGTGCAAGAGGT 
CCCACTGGGAAACCTGGGCCAAAGGGCACTTCAGGTGGCGATGGCCCTCCTGGCCCTCCAGGTGAAAGAGGTCCTCA 
AGGACCTCAGGGTCCAGTTGGATTCCCTGGACCAAAAGGCCCTCCTGGACCACCTGGGAAGGATGGGCTGCCAGGAC 
ACCCTGGGCAACGTGGGGAGACTGGATTTCAAGGCAAGACCGGCCCTCCTGGGCCAGGGGGAGTGGTTGGACCACAG 
GGACCAACCGGTGAGACTGGTCCAATAGGGGAACGTGGGCATCCTGGCCCTCCTGGCCCTCCTGGTGAGCAAGGTCT 
TCCTGGTGCTGCAGGAAAAGAAGGTGCAAAGGGTGATCCAGGTCCTCAAGGTATCTCAGGGAAAGATGGACCAGCAG 
GATTACGTGGTTTCCCAGGGGAAAGAGGTCTTCCTGGAGCTCAGGGTGCACCTGGACTGAAAGGAGGGGAAGGTCCC 
CAGGGCCCACCAGGTCCAGTTGTAAGTATGATGATAATAAATAGCCAGACAATCATGGTTGTGAATTACAGCTCTTC 
TTTCATTACTCTCATGCTGTGATTCCACAGTGTGTGGGAGAGAAAATAAACACATGTCAATCAAATCAGTCAA

<210> SEQ ID NO 100 

<211> Length : 2,648

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 100 
>HUMCA1XIA_T17 

ACACAGTACTCTCAGCTTGTTGGTGGAAGCCCCTCATCTGCCTTCATTCTGAAGGCAGGGCCCGGCAGAGGAAGGAT 
CAGAGGGTCGCGGCCGGAGGGTCCCGGCCGGTGGGGCCAACTCAGAGGGAGAGGAAAGGGCTAGAGACACGAAGAAC 
GCAAACCATCAAATTTAGAAGAAAAAGCCCTTTGACTTTTTCCCCCTCTCCCTCCCCAATGGCTGTGTAGCAAACAT 
CCCTGGCGATACCTTGGAAAGGACGAAGTTGGTCTGCAGTCGCAATTTCGTGGGTTGAGTTCACAGTTGTGAGTGCG 
GGGCTCGGAGATGGAGCCGTGGTCCTCTAGGTGGAAAACGAAACGGTGGCTCTGGGATTTCACCGTAACAACCCTCG 
CATTGACCTTCCTCTTCCAAGCTAGAGAGGTCAGAGGAGCTGCTCCAGTTGATGTACTAAAAGCACTAGATTTTCAC
AATTCTCCAGAGGGAATATCAAAAACAACGGGATTTTGCACAAACAGAAAGAATTCTAAAGGCTCAGATACTGCTTA 
CAGAGTTTCAAAGCAAGCACAACTCAGTGCCCCAACAAAACAGTTATTTCCAGGTGGAACTTTCCCAGAAGACTTTT 
CAATACTATTTACAGTAAAACCAAAAAAAGGAATTCAGTCTTTCCTTTTATCTATATATAATGAGCATGGTATTCAG 
CAAATTGGTGTTGAGGTTGGGAGATCACCTGTTTTTCTGTTTGAAGACCACACTGGAAAACCTGCCCCAGAAGACTA 
TCCCCTCTTCAGAACTGTTAACATCGCTGACGGGAAGTGGCATCGGGTAGCAATCAGCGTGGAGAAGAAAACTGTGA
CAATGATTGTTGATTGTAAGAAGAAAACCACGAAACCACTTGATAGAAGTGAGAGAGCAATTGTTGATACCAATGGA 
ATCACGGTTTTTGGAACAAGGATTTTGGATGAAGAAGTTTTTGAGGGGGACATTCAGCAGTTTTTGATCACAGGTGA
TCCCAAGGCAGCATATGACTACTGTGAGCATTATAGTCCAGACTGTGACTCTTCAGCACCCAAGGCTGCTCAAGCTC
AGGAACCTCAGATAGATGAGTATGCACCAGAGGATATAATCGAATATGACTATGAGTATGGGGAAGCAGAGTATAAA 
GAGGCTGAAAGTGTAACAGAGGGACCCACTGTAACTGAGGAGACAATAGCACAGACGGAGGCAAACATCGTTGATGA 
TTTTCAAGAATACAACTATGGAACAATGGAAAGTTACCAGACAGAAGCTCCTAGGCATGTTTCTGGGACAAATGAGC 
CAAATCCAGTTGAAGAAATATTTACTGAAGAATATCTAACGGGAGAGGATTATGATTCCCAGAGGAAAAATTCTGAG 
GATACACTATATGAAAACAAAGAAATAGACGGCAGGGATTCTGATCTTCTGGTAGATGGAGATTTAGGCGAATATGA 
TTTTTATGAATATAAAGAATATGAAGATAAACCAACAAGCCCCCCTAATGAAGAATTTGGTCCAGGTGTACCAGCAG 
AAACTGATATTACAGAAACAAGCATAAATGGCCATGGTGCATATGGAGAGAAAGGACAGAAAGGAGAACCAGCAGTG 
GTTGAGCCTGGTATGCTTGTCGAAGGACCACCAGGACCAGCAGGACCTGCAGGTATTATGGGTCCTCCAGGTCTACA 
AGGCCCCACTGGACCCCCTGGTGACCCTGGCGATAGGGGCCCCCCAGGACGTCCTGGCTTACCAGGGGCTGATGGTC 
TACCTGGTCCTCCTGGTACTATGTTGATGTTACCGTTCCGTTATGGTGGTGATGGTTCCAAAGGACCAACCATCTCT 
GCTCAGGAAGCTCAGGCTCAAGCTATTCTTCAGCAGGCTCGGATTGCTCTGAGAGGCCCACCTGGCCCAATGGGTCT 
AACTGGAAGACCAGGTCCTGTGGGGGGGCCTGGTTCATCTGGGGCCAAAGGTGAGAGTGGTGATCCAGGTCCTCAGG 
GCCCTCGAGGCGTCCAGGGTCCCCCTGGTCCAACGGGAAAACCTGGAAAAAGGGGTCGTCCAGGTGCAGATGGAGGA 
AGAGGAATGCCAGGAGAACCTGGGGCAAAGGGAGATCGAGGGTTTGATGGACTTCCGGGTCTGCCAGGTGACAAAGG 
TCACAGGGGTGAACGAGGTCCTCAAGGTCCTCCAGGTCCTCCTGGTGATGATGGAATGAGGGGAGAAGATGGAGAAA 
TTGGACCAAGAGGTCTTCCAGGTGAAGCTGGCCCACGAGGTTTGCTGGGTCCAAGGGGAACTCCAGGAGCTCCAGGG 
CAGCCTGGTATGGCAGGTGTAGATGGCCCCCCAGGACCAAAAGGGAACATGGGTCCCCAAGGGGAGCCTGGGCCTCC 
AGGTCAACAAGGGAATCCAGGACCTCAGGGTCTTCCTGGTCCACAAGGTCCAATTGGTCCTCCTGGTGAAAAAATGT 
GCTGCAATCTAAGTTTCGGAATACTTATACCACTCCAGAAATAATCCTCGTGTTAATATTCAGTCCATGTTTCCACC 
CCCTCAGACCTAGCTAACAATTAATTTGCTTTCTGTCTCTGTAGATTTGAGTTTTCTGAATATTTCATGTGAATGAA 
ACTTATGAATAAATTTATGATTTAATCATA

<210> SEQ ID NO 101 

<211> Length : 3,475 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 101 
>HUMCA1XIA_T19 

ACACAGTACTCTCAGCTTGTTGGTGGAAGCCCCTCATCTGCCTTCATTCTGAAGGCAGGGCCCGGCAGAGGAAGGAT 
CAGAGGGTCGCGGCCGGAGGGTCCCGGCCGGTGGGGCCAACTCAGAGGGAGAGGAAAGGGCTAGAGACACGAAGAAC 
GCAAACCATCAAATTTAGAAGAAAAAGCCCTTTGACTTTTTCCCCCTCTCCCTCCCCAATGGCTGTGTAGCAAACAT 
CCCTGGCGATACCTTGGAAAGGACGAAGTTGGTCTGCAGTCGCAATTTCGTGGGTTGAGTTCACAGTTGTGAGTGCG 
GGGCTCGGAGATGGAGCCGTGGTCCTCTAGGTGGAAAACGAAACGGTGGCTCTGGGATTTCACCGTAACAACCCTCG 
CATTGACCTTCCTCTTCCAAGCTAGAGAGGTCAGAGGAGCTGCTCCAGTTGATGTACTAAAAGCACTAGATTTTCAC 
AATTCTCCAGAGGGAATATCAAAAACAACGGGATTTTGCACAAACAGAAAGAATTCTAAAGGCTCAGATACTGCTTA
CAGAGTTTCAAAGCAAGCACAACTCAGTGCCCCAACAAAACAGTTATTTCCAGGTGGAACTTTCCCAGAAGACTTTT
CAATACTATTTACAGTAAAACCAAAAAAAGGAATTCAGTCTTTCCTTTTATCTATATATAATGAGCATGGTATTCAG
CAAATTGGTGTTGAGGTTGGGAGATCACCTGTTTTTCTGTTTGAAGACCACACTGGAAAACCTGCCCCAGAAGACTA
TCCCCTCTTCAGAACTGTTAACATCGCTGACGGGAAGTGGCATCGGGTAGCAATCAGCGTGGAGAAGAAAACTGTGA
CAATGATTGTTGATTGTAAGAAGAAAACCACGAAACCACTTGATAGAAGTGAGAGAGCAATTGTTGATACCAATGGA
ATCACGGTTTTTGGAACAAGGATTTTGGATGAAGAAGTTTTTGAGGGGGACATTCAGCAGTTTTTGATCACAGGTGA
TCCCAAGGCAGCATATGACTACTGTGAGCATTATAGTCCAGACTGTGACTCTTCAGCACCCAAGGCTGCTCAAGCTC
AGGAACCTCAGATAGATGAGTATGCACCAGAGGATATAATCGAATATGACTATGAGTATGGGGAAGCAGAGTATAAA
GAGGCTGAAAGTGTAACAGAGGGACCCACTGTAACTGAGGAGACAATAGCACAGACGGAGGCAAACATCGTTGATGA
TTTTCAAGAATACAACTATGGAACAATGGAAAGTTACCAGACAGAAGCTCCTAGGCATGTTTCTGGGACAAATGAGC
CAAATCCAGTTGAAGAAATATTTACTGAAGAATATCTAACGGGAGAGGATTATGATTCCCAGAGGAAAAATTCTGAG
GATACACTATATGAAAACAAAGAAATAGACGGCAGGGATTCTGATCTTCTGGTAGATGGAGATTTAGGCGAATATGA
TTTTTATGAATATAAAGAATATGAAGATAAACCAACAAGCCCCCCTAATGAAGAATTTGGTCCAGGTGTACCAGCAG
AAACTGATATTACAGAAACAAGCATAAATGGCCATGGTGCATATGGAGAGAAAGGACAGAAAGGAGAACCAGCAGTG
GTTGAGCCTGGTATGCTTGTCGAAGGACCACCAGGACCAGCAGGACCTGCAGGTATTATGGGTCCTCCAGGTCTACA
AGGCCCCACTGGACCCCCTGGTGACCCTGGCGATAGGGGCCCCCCAGGACGTCCTGGCTTACCAGGGGCTGATGGTC
TACCTGGTCCTCCTGGTACTATGTTGATGTTACCGTTCCGTTATGGTGGTGATGGTTCCAAAGGACCAACCATCTCT
GCTCAGGAAGCTCAGGCTCAAGCTATTCTTCAGCAGGCTCGGATTGCTCTGAGAGGCCCACCTGGCCCAATGGGTCT
AACTGGAAGACCAGGTCCTGTGGGGGGGCCTGGTTCATCTGGGGCCAAAGGTGAGAGTGGTGATCCAGGTCCTCAGG
GCCCTCGAGGCGTCCAGGGTCCCCCTGGTCCAACGGGAAAACCTGGAAAAAGGGGTCGTCCAGGTGCAGATGGAGGA
AGAGGAATGCCAGGAGAACCTGGGGCAAAGGGAGATCGAGGGTTTGATGGACTTCCGGGTCTGCCAGGTGACAAAGG
TCACAGGGGTGAACGAGGTCCTCAAGGTCCTCCAGGTCCTCCTGGTGATGATGGAATGAGGGGAGAAGATGGAGAAA
TTGGACCAAGAGGTCTTCCAGGTGAAGCTGGTATGGCAGGTGTAGATGGCCCCCCAGGACCAAAAGGGAACATGGGT
CCCCAAGGGGAGCCTGGGCCTCCAGGTCAACAAGGGAATCCAGGACCTCAGGGTCTTCCTGGTCCACAAGGTCCAAT
TGGTCCTCCTGGTGAAAAAGTCTCCTTTTCTTTCTCATTATTTTACAAAAAAGTAATTAAGTTTGCTTGTGACAAAA
GATTTGTAGGAAGACATGATGAAAGGAAAGTTGTGAAGTTGTCATTGCCATTATATCTCATATATGAATAGCAAATG
ATGTTCTTGTTAATAACTCAATGTCTGGATCAAAACAAGAATAAATCTATAATTAAACACATGTCTTTTTCCCTGAT
CTCCACTGTGAGATTCCTTGAGTAATATTTTGTCCCCTGTAGTCATAGCACATTTCTATCTGGCTCTTTCCAACACC
TTTTTTCTTTCATTATTTTTGTTGATTTCTACAAACATATTAATTAAAAAAAACTAATAGCTTTATCGAAGTGTAAT
TTAAATACTATAAATATTCCCTTGTTTTAAGTGTACAAGTGAATTATTTTTACTAAATTTACAGATGTGCTGCAATC
TAAGTTTCGGAATACTTATACCACTCCAGAAATAATCCTCGTGTTAATATTCAGTCCATGTTTCCACCCCCTCAGAC
CTAGCTAACAATTAATTTGCTTTCTGTCTCTGTAGATTTGAGTTTTCTGAATATTTCATGTGAATGAAACTTATGAA
TAAATTTATGATTTAATCATATGAATGTGTATGTGACCTTTATGTCTGACTTCTTTCATTTTAGGTTCATCCATGCT
GTAGCATACATATATTAGTACTTTGCTCCTTATTTTGTTCTCATAGTATTCCATTGATTGGGTATACCAGGTTCTGT
TTACTTTTACTTGGCAGTTGATAGAATAGGTGTAGTTTATACTTTTTCGCTATTCTCCATACCGGTGCTGTAGTGAA
TAATTGCATACAAGTCTTTGTATAGATGTGTTTTCATTCTTTTTGGTATATACTTAGAAGCAGAATTCTTGTGTTAT
GGTAAACTTATATTCAATATTTTGTGAATTCCACTCTTTTCCATATCGATTGTACCATTTTCCCTTCCAAGTAACCA
TGTATGAGGATAGTCATTTCTGCACATTCTCACTAATGCTTGTTATTGTCTGTCTTCTTGATTACGATCATTCTCGT
TGGTGTGAAA 

<210> SEQ ID NO 102 

<211> Length : 1,271 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 102 
>HUMCA1XIA_T20

ACACAGTACTCTCAGCTTGTTGGTGGAAGCCCCTCATCTGCCTTCATTCTGAAGGCAGGGCCCGGCAGAGGAAGGAT 
CAGAGGGTCGCGGCCGGAGGGTCCCGGCCGGTGGGGCCAACTCAGAGGGAGAGGAAAGGGCTAGAGACACGAAGAAC 
GCAAACCATCAAATTTAGAAGAAAAAGCCCTTTGACTTTTTCCCCCTCTCCCTCCCCAATGGCTGTGTAGCAAACAT 
CCCTGGCGATACCTTGGAAAGGACGAAGTTGGTCTGCAGTCGCAATTTCGTGGGTTGAGTTCACAGTTGTGAGTGCG 
GGGCTCGGAGATGGAGCCGTGGTCCTCTAGGTGGAAAACGAAACGGTGGCTCTGGGATTTCACCGTAACAACCCTCG 
CATTGACCTTCCTCTTCCAAGCTAGAGAGGTCAGAGGAGCTGCTCCAGTTGATGTACTAAAAGCACTAGATTTTCAC 
AATTCTCCAGAGGGAATATCAAAAACAACGGGATTTTGCACAAACAGAAAGAATTCTAAAGGCTCAGATACTGCTTA 
CAGAGTTTCAAAGCAAGCACAACTCAGTGCCCCAACAAAACAGTTATTTCCAGGTGGAACTTTCCCAGAAGACTTTT 
CAATACTATTTACAGTAAAACCAAAAAAAGGAATTCAGTCTTTCCTTTTATCTATATATAATGAGCATGGTATTCAG 
CAAATTGGTGTTGAGGTTGGGAGATCACCTGTTTTTCTGTTTGAAGACCACACTGGAAAACCTGCCCCAGAAGACTA 
TCCCCTCTTCAGAACTGTTAACATCGCTGACGGGAAGTGGCATCGGGTAGCAATCAGCGTGGAGAAGAAAACTGTGA 
CAATGATTGTTGATTGTAAGAAGAAAACCACGAAACCACTTGATAGAAGTGAGAGAGCAATTGTTGATACCAATGGA 
ATCACGGTTTTTGGAACAAGGATTTTGGATGAAGAAGTTTTTGAGGGGGACATTCAGCAGTTTTTGATCACAGGTGA 
TCCCAAGGCAGCATATGACTACTGTGAGCATTATAGTCCAGACTGTGACTCTTCAGCACCCAAGGCTGCTCAAGCTC 
AGGAACCTCAGATAGATGAGGTGAGGAGCACAAGACCAGAGAAGGTCTTCGTATTTCAGTGATATGGACATAGCAGT 
GCAGTTTTTGAACTTATACTTATTTATTCCATTTTAATTAAGCTATGCTTTGTATTTTAATTGTGTTGTAATATTTC 
CAGGAAAAAGTGACTTGAATATATTTGGTACTTGTTTTC 

<210> SEQ ID NO 103 

<211> Length : 1,225 

<212> Type : DNA 

<213> Organism : Homo sapiens 



WO 2006/131783 



PCT/IB2005/004037 



121 

<400> sequence : 103 
>T11628_PEA_1_T3 

GCAAGCCAAAACCCTGGGCAGACTCAATCCAAAAATAAACAATCAAAGAGCATGTTGGCCTGGTCCTTTGCTAGGTA 
CTGTAGAGCAGGTGAGAGAGTGAGGGGGAAGGACTCCAAATTAGACCAGTTCTTAGCCATGAAGCAGAGACTCTGAA 
GCCAGACTACCTGGGTCCCAATCTTGGGCTTGGTATTTCCTCGCTGTGTGACTCTGGACTGCGCCATGGGGCTCAGC 
GACGGGGAATGGCAGTTGGTGCTGAACGTCTGGGGGAAGGTGGAGGCTGACATCCCAGGCCATGGGCAGGAAGTCCT 
CATCAGGCTCTTTAAGGGTCACCCAGAGACTCTGGAGAAGTTTGACAAGTTCAAGCACCTGAAGTCAGAGGACGAGA 
TGAAGGCGTCTGAGGACTTAAAGAAGCATGGTGCCACCGTGCTCACCGCCCTGGGTGGCATCCTTAAGAAGAAGGGG 
CATCATGAGGCAGAGATTAAGCCCCTGGCACAGTCGCATGCCACCAAGCACAAGATCCCCGTGAAGTACCTGGAGTT 
CATCTCGGAATGCATCATCCAGGTTCTGCAGAGCAAGCATCCCGGGGACTTTGGTGCTGATGCCCAGGGGGCCATGA 
ACAAGGCCCTGGAGCTGTTCCGGAAGGACATGGCCTCCAACTACAAGGAGCTGGGCTTCCAGGGCTAGGCCCCTGCC 
GCTCCCACCCCCACCCATCTGGGCCCCGGGTTCAAGAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGC 
TTCTGAGTGTCTGCTTTGTTTAGTAGAGGTGGGCAGGAGGAGCTGAGGGGCTGGGGCTGGGGTGTTGAAGTTGGCTT 
TGCATGCCCAGCGATGCGCCTCCCTGTGGGATGTCATCACCCTGGGAACCGGGAGTGGCCCTTGGCTCACTGTGTTC 
TGCATGGTTTGGATCTGAATTAATTGTCCTTTCTTCTAAATCCCAACCGAACTTCTTCCAACCTCCAAACTGGCTGT 
AACCCCAAATCCAAGCCATTAACTACACCTGACAGTAGCAATTGTCTGATTAATCACTGGCCCCTTGAAGACAGCAG 
AATGTCCCTTTGCAATGAGGAGGAGATCTGGGCTGGGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTG 
TGTGCCTCCTCAGGTATGGCAGTGACTCACCTGGTTTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 104 

<211> Length : 1,210 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 104 
>T11628_PEA_1_T4

TGTCTCTGAGAGCATTGATGAGGTCAGGAAGCCTCCTGTTGGGTAGAGGAGCAACTAAGAGACTGAACTTGGCCCCC 
ACCCTGAGGCTCACAAGCTTGAATTGCACCTGAGTTCCAAAGGAGAAGTTGACATTCTTCCAGAACATATGCCCAGT 
GTCTTCAACTTGAGATGGAGCTGGGATGCCAAGTCTGCAAATACTGCGCCATGGGGCTCAGCGACGGGGAATGGCAG 
TTGGTGCTGAACGTCTGGGGGAAGGTGGAGGCTGACATCCCAGGCCATGGGCAGGAAGTCCTCATCAGGCTCTTTAA 
GGGTCACCCAGAGACTCTGGAGAAGTTTGACAAGTTCAAGCACCTGAAGTCAGAGGACGAGATGAAGGCGTCTGAGG
ACTTAAAGAAGCATGGTGCCACCGTGCTCACCGCCCTGGGTGGCATCCTTAAGAAGAAGGGGCATCATGAGGCAGAG 
ATTAAGCCCCTGGCACAGTCGCATGCCACCAAGCACAAGATCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCAT 
CATCCAGGTTCTGCAGAGCAAGCATCCCGGGGACTTTGGTGCTGATGCCCAGGGGGCCATGAACAAGGCCCTGGAGC 
TGTTCCGGAAGGACATGGCCTCCAACTACAAGGAGCTGGGCTTCCAGGGCTAGGCCCCTGCCGCTCCCACCCCCACC 
CATCTGGGCCCCGGGTTCAAGAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGTGTCTGCT 
TTGTTTAGTAGAGGTGGGCAGGAGGAGCTGAGGGGCTGGGGCTGGGGTGTTGAAGTTGGCTTTGCATGCCCAGCGAT
GCGCCTCCCTGTGGGATGTCATCACCCTGGGAACCGGGAGTGGCCCTTGGCTCACTGTGTTCTGCATGGTTTGGATC 
TGAATTAATTGTCCTTTCTTCTAAATCCCAACCGAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAG 
CCATTAACTACACCTGACAGTAGCAATTGTCTGATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCCCTTTGCAA 
TGAGGAGGAGATCTGGGCTGGGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCTCCTCAGGT 
ATGGCAGTGACTCACCTGGTTTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 105 

<211> Length : 1,192 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 105 
>T11628_PEA_1_T5

CTTGGCTGGAGGCTCTGCGAGGACAGCTGGGGAGAAGGGGAGCTGTGCCCAGATTTACCAAAGGGAATTGTCAGCTG 
TCCAAGGGCTAGCAAATTCCTAGGTCACCTAGATTGGATTTTCTGACCATAAAAACTGTGGGCCAGGTGCACAGCTG 
CCTGAGGGGCTCAAACCTGTGCAGACTGCGCCATGGGGCTCAGCGACGGGGAATGGCAGTTGGTGCTGAACGTCTGG 
GGGAAGGTGGAGGCTGACATCCCAGGCCATGGGCAGGAAGTCCTCATCAGGCTCTTTAAGGGTCACCCAGAGACTCT 
GGAGAAGTTTGACAAGTTCAAGCACCTGAAGTCAGAGGACGAGATGAAGGCGTCTGAGGACTTAAAGAAGCATGGTG 
CCACCGTGCTCACCGCCCTGGGTGGCATCCTTAAGAAGAAGGGGCATCATGAGGCAGAGATTAAGCCCCTGGCACAG 
TCGCATGCCACCAAGCACAAGATCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCATCATCCAGGTTCTGCAGAG 
CAAGCATCCCGGGGACTTTGGTGCTGATGCCCAGGGGGCCATGAACAAGGCCCTGGAGCTGTTCCGGAAGGACATGG 
CCTCCAACTACAAGGAGCTGGGCTTCCAGGGCTAGGCCCCTGCCGCTCCCACCCCCACCCATCTGGGCCCCGGGTTC 
AAGAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGTGTCTGCTTTGTTTAGTAGAGGTGGG 
CAGGAGGAGCTGAGGGGCTGGGGCTGGGGTGTTGAAGTTGGCTTTGCATGCCCAGCGATGCGCCTCCCTGTGGGATG 
TCATCACCCTGGGAACCGGGAGTGGCCCTTGGCTCACTGTGTTCTGCATGGTTTGGATCTGAATTAATTGTCCTTTC 
TTCTAAATCCCAACCGAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAGCCATTAACTACACCTGAC 
AGTAGCAATTGTCTGATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCCCTTTGCAATGAGGAGGAGATCTGGGC 
TGGGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCTCCTCAGGTATGGCAGTGACTCACCTG 
GTTTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 106 
<211> Length : 1,177 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 106 
>T11628_PEA_1_T7

CTTGGCTGGAGGCTCTGCGAGGACAGCTGGGGAGAAGGGGAGCTGTGATGGAGGCTCGCTCTGTTGCCAGGCTGGAG 
TACAGCGATCTCGGCTCACTGCAACCTCTGCCTCCCGGGTTCAAGTGATTCTCCTGCCTCAGCCTCCCAAGTAGCTG 
GGACTACAGACTGCGCCATGGGGCTCAGCGACGGGGAATGGCAGTTGGTGCTGAACGTCTGGGGGAAGGTGGAGGCT 
GACATCCCAGGCCATGGGCAGGAAGTCCTCATCAGGCTCTTTAAGGGTCACCCAGAGACTCTGGAGAAGTTTGACAA 
GTTCAAGCACCTGAAGTCAGAGGACGAGATGAAGGCGTCTGAGGACTTAAAGAAGCATGGTGCCACCGTGCTCACCG 
CCCTGGGTGGCATCCTTAAGAAGAAGGGGCATCATGAGGCAGAGATTAAGCCCCTGGCACAGTCGCATGCCACCAAG 
CACAAGATCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCATCATCCAGGTTCTGCAGAGCAAGCATCCCGGGGA 
CTTTGGTGCTGATGCCCAGGGGGCCATGAACAAGGCCCTGGAGCTGTTCCGGAAGGACATGGCCTCCAACTACAAGG 
AGCTGGGCTTCCAGGGCTAGGCCCCTGCCGCTCCCACCCCCACCCATCTGGGCCCCGGGTTCAAGAGAGAGCGGGGT 
CTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGTGTCTGCTTTGTTTAGTAGAGGTGGGCAGGAGGAGCTGAGG 
GGCTGGGGCTGGGGTGTTGAAGTTGGCTTTGCATGCCCAGCGATGCGCCTCCCTGTGGGATGTCATCACCCTGGGAA 
CCGGGAGTGGCCCTTGGCTCACTGTGTTCTGCATGGTTTGGATCTGAATTAATTGTCCTTTCTTCTAAATCCCAACC 
GAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAGCCATTAACTACACCTGACAGTAGCAATTGTCTG 
ATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCCCTTTGCAATGAGGAGGAGATCTGGGCTGGGCGGGCCAGCTG 
GGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCTCCTCAGGTATGGCAGTGACTCACCTGGTTTTAATAAAACAA 
CCTGCAACATCTCAAAAAAAGC

<210> SEQ ID NO 107 

<211> Length : 1,051 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 107 
>T11628_PEA_1_T9

GCCTTATTTCTCTGCTGGTTGAGCGAAGGGATTGTCTTCCATGGTCTCCGAGATCCCGTCCCACGCTCATGCCCTAG 
AATTCTCTGAGTCCTTGATGCACTTTTGCCTTTGGCGAGGAGGCAGGACAGTCAGGCGTGGAGGCTCTTTAAGGGTC 
ACCCAGAGACTCTGGAGAAGTTTGACAAGTTCAAGCACCTGAAGTCAGAGGACGAGATGAAGGCGTCTGAGGACTTA 
AAGAAGCATGGTGCCACCGTGCTCACCGCCCTGGGTGGCATCCTTAAGAAGAAGGGGCATCATGAGGCAGAGATTAA 
GCCCCTGGCACAGTCGCATGCCACCAAGCACAAGATCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCATCATCC 
AGGTTCTGCAGAGCAAGCATCCCGGGGACTTTGGTGCTGATGCCCAGGGGGCCATGAACAAGGCCCTGGAGCTGTTC 
CGGAAGGACATGGCCTCCAACTACAAGGAGCTGGGCTTCCAGGGCTAGGCCCCTGCCGCTCCCACCCCCACCCATCT 
GGGCCCCGGGTTCAAGAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGTGTCTGCTTTGTT
TAGTAGAGGTGGGCAGGAGGAGCTGAGGGGCTGGGGCTGGGGTGTTGAAGTTGGCTTTGCATGCCCAGCGATGCGCC 
TCCCTGTGGGATGTCATCACCCTGGGAACCGGGAGTGGCCCTTGGCTCACTGTGTTCTGCATGGTTTGGATCTGAAT 
TAATTGTCCTTTCTTCTAAATCCCAACCGAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAGCCATT 
AACTACACCTGACAGTAGCAATTGTCTGATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCCCTTTGCAATGAGG 
AGGAGATCTGGGCTGGGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCTCCTCAGGTATGGC 
AGTGACTCACCTGGTTTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 108 

<211> Length : 1,267 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 108 
>T11628_PEA_1_T11

TCCTCCCCTTCTTTCCACACGCACAACCACCCCACCCCCTGTGGCCTGAGCTGTCCTGCCTCGCCACAATGGCACCT 
GCCCTAAAATAGCTTCCCATGTGAGGGCTAGAGAAAGGAAAAGATTAGACCCTCCCTGGATGAGAGAGAGAAAGTGA 
AGGAGGGCAGGGGAGGGGGACAGCGAGCCATTGAGCGATCTTTGTCAAGCATCCCAGAAGGTATAAAAACGCCCTTG 
GGACCAGGCAGCCTCAAACCCCAGCTGTTGGGGCCAGGACACCCAGTGAGCCCATACTTGCTCTTTTTGTCTTCTTC 
AGACTGCGCCATGGGGCTCAGCGACGGGGAATGGCAGTTGGTGCTGAACGTCTGGGGGAAGGTGGAGGCTGACATCC 
CAGGCCATGGGCAGGAAGTCCTCATCAGGCTCTTTAAGGGTCACCCAGAGACTCTGGAGAAGTTTGACAAGTTCAAG 
CACCTGAAGTCAGAGGACGAGATGAAGGCGTCTGAGGACTTAAAGAAGCATGGTGCCACCGTGCTCACCGCCCTGGG 
TGGCATCCTTAAGAAGAAGGGGCATCATGAGGCAGAGATTAAGCCCCTGGCACAGTCGCATGCCACCAAGCACAAGA 
TCCCCGTGAAGTACCTGGAGTTCATCTCGGAATGCATCATCCAGGTTCTGCAGAGCAAGCATCCCGGGGACTTTGGT 
GCTGATGCCCAGGGGGCCATGAACAAGGGCTAGGCCCCTGCCGCTCCCACCCCCACCCATCTGGGCCCCGGGTTCAA 
GAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGTGTCTGCTTTGTTTAGTAGAGGTGGGCA 
GGAGGAGCTGAGGGGCTGGGGCTGGGGTGTTGAAGTTGGCTTTGCATGCCCAGCGATGCGCCTCCCTGTGGGATGTC 
ATCACCCTGGGAACCGGGAGTGGCCCTTGGCTCACTGTGTTCTGCATGGTTTGGATCTGAATTAATTGTCCTTTCTT 
CTAAATCCCAACCGAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAGCCATTAACTACACCTGACAG 
TAGCAATTGTCTGATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCCCTTTGCAATGAGGAGGAGATCTGGGCTG 
GGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCTCCTCAGGTATGGCAGTGACTCACCTGGT 
TTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 109 
<211> Length : 3,897
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 109 
>HUMCEA_PEA_1_T8

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC 
TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 
CCTGGCAGAGGCTCCTGCTCACAGCCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAA 
TCCACGCCGTTCAATGTCGCAGAGGGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTA 
CAGCTGGTACAAAGGTGAAAGAGTGGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCC 
CAGGGCCCGCATACAGTGGTCGAGAGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAGAATGAC 
ACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGTATACCC 
GGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTG 
AACCTGAGACTCAGGACGCAACCTACCTGTGGTGGGTAAACAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTG 
TCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACACAGCAAGCTACAAATGTGAAACCCAGAA 
CCCAGTGAGTGCCAGGCGCAGTGATTCAGTCATCCTGAATGTCCTCTGTGAGTATATCTGCTCCTCTCTGGCCCAGG 
CTGCCAGCCCAAATCCACAGGGCCAGAGGCAGGATTTCTCAGTCCCTCTCAGGTTCAAGTACACAGACCCTCAACCC 
TGGACATCCAGACTGTCTGTGACTTTCTGCCCCAGAAAAACCTGGGCAGACCAAGTCTTGACCAAGAATAGGAGGGG 
AGGGGCTGCTTCTGTCCTGGGAGGCTCAGGGTCCACACCCTATGATGGGAGAAACAGGTGAATATCTCAGACTCAGG 
CTCAGTAGATACAAGAGGGGTTTGGCTGAGACTTTAGGATTGTGATTCAGCTTAGAGGGACACTGTGGTCCTTCCAT 
AGACCAGGAACTTCCACTTCCCTCTGACAATATCACCTGTGGCTTTATTTTGTTTGCTCCAGATGGCCCGGATGCCC 
CCACCATTTCCCCTCTAAACACATCTTACAGATCAGGGGAAAATCTGAACCTCTCCTGCCACGCAGCCTCTAACCCA 
CCTGCACAGTACTCTTGGTTTGTCAATGGGACTTTCCAGCAATCCACCCAAGAGCTCTTTATCCCCAACATCACTGT 
GAATAATAGTGGATCCTATACGTGCCAAGCCCATAACTCAGACACTGGCCTCAATAGGACCACAGTCACGACGATCA 
CAGTCTATGCAGAGCCACCCAAACCCTTCATCACCAGCAACAACTCCAACCCCGTGGAGGATGAGGATGCTGTAGCC 
TTAACCTGTGAACCTGAGATTCAGAACACAACCTACCTGTGGTGGGTAAATAATCAGAGCCTCCCGGTCAGTCCCAG 
GCTGCAGCTGTCCAATGACAACAGGACCCTCACTCTACTCAGTGTCACAAGGAATGATGTAGGACCCTATGAGTGTG 
GAATCCAGAACGAATTAAGTGTTGACCACAGCGACCCAGTCATCCTGAATGTCCTCTATGGCCCAGACGACCCCACC 
ATTTCCCCCTCATACACCTATTACCGTCCAGGGGTGAACCTCAGCCTCTCCTGCCATGCAGCCTCTAACCCACCTGC 
ACAGTATTCTTGGCTGATTGATGGGAACATCCAGCAACACACACAAGAGCTCTTTATCTCCAACATCACTGAGAAGA 
ACAGCGGACTCTATACCTGCCAGGCCAATAACTCAGCCAGTGGCCACAGCAGGACTACAGTCAAGACAATCACAGTC 
TCTGCGGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCAC
CTGTGAACCTGAGGCTCAGAACACAACCTACCTGTGGTGGGTAAATGGTCAGAGCCTCCCAGTCAGTCCCAGGCTGC 
AGCTGTCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACGCAAGAGCCTATGTATGTGGAATC 
CAGAACTCAGTGAGTGCAAACCGCAGTGACCCAGTCACCCTGGATGTCCTCTATGGGCCGGACACCCCCATCATTTC 
CCCCCCAGACTCGTCTTACCTTTCGGGAGCGAACCTCAACCTCTCCTGCCACTCGGCCTCTAACCCATCCCCGCAGT 
ATTCTTGGCGTATCAATGGGATACCGCAGCAACACACACAAGTTCTCTTTATCGCCAAAATCACGCCAAATAATAAC 
GGGACCTATGCCTGTTTTGTCTCTAACTTGGCTACTGGCCGCAATAATTCCATAGTCAAGAGCATCACAGTCTCTGC
ATCTGGAACTTCTCCTGGTCTCTCAGCTGGGGCCACTGTCGGCATCATGATTGGAGTGCTGGTTGGGGTTGCTCTGA 
TATAGCAGCCCTGGTGTAGTTTCTTCATTTCAGGAAGACTGACAGTTGTTTTGCTTCTTCCTTAAAGCATTTGCAAC 
AGCTACAGTCTAAAATTGCTTCTTTACCAAGGATATTTACAGAAAAGACTCTGACCAGAGATCGAGACCATCCTAGC 
CAACATCGTGAAACCCCATCTCTACTAAAAATACAAAAATGAGCTGGGCTTGGTGGCGCGCACCTGTAGTCCCAGTT 
ACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGTGGAGATTGCAGTGAGCCCAGATCGCACCACTGCA 
CTCCAGTCTGGCAACAGAGCAAGACTCCATCTCAAAAAGAAAAGAAAAGAAGACTCTGACCTGTACTCTTGAATACA 
AGTTTCTGATACCACTGCACTGTCTGAGAATTTCCAAAACTTTAATGAACTAACTGACAGCTTCATGAAACTGTCCA 
CCAAGATCAAGCAGAGAAAATAATTAATTTCATGGGACTAAATGAACTAATGAGGATAATATTTTCATAATTTTTTA 
TTTGAAATTTTGCTGATTCTTTAAATGTCTTGTTTCCCAGATTTCAGGAAACTTTTTTTCTTTTAAGCTATCCACAG 
CTTACAGCAATTTGATAAAATATACTTTTGTGAACAAAAATTGAGACATTTACATTTTCTCCCTATGTGGTCGCTCC 
AGACTTGGGAAACTATTCATGAATATTTATATTGTATGGTAATATAGTTATTGCACAAGTTCAATAAAAATCTGCTC 
TTTGTATAACAGAATACATTTGAAAACATTGGTTATATTACCAAGACTTTGACTAGAATGTCGTATTTGAGGATATA 
AACCCATAGGTAATAAACCCACAGGTACTACAAACAAAGTCTGAAGTCAGCCTTGGTTTGGCTTCCTAGTGTCAATT 
AAACTTCTAAAAGTTTAATCTGAGATTCCTTATAAAAACTTCCAGCAAAGCAACTTTAAAAAAGTCTGTGTGGGCCG 
GGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGATCCGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCCAGA 
CCATCCTGGCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAGTTAGCCGGGCGTGGTGGTGGGGGCC 
TGTAGTCCCAGCTACTCAGGAGGCTGAGGCAGGAGAACGGCATGAACCCGGGAGGCAGGGCTTGCAGTGAGCCAAGA 
TCATGCCGCTGCACTCCAGCCTGGGAGACAAAGTGAGACTCCGTCAA 

<210> SEQ ID NO 110 

<211> Length : 3,347 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 110 
>HUMCEA_PEA_1_T9

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC 
TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 
CCTGGCAGAGGCTCCTGCTCACAGCCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAA 
TCCACGCCGTTCAATGTCGCAGAGGGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTA 
CAGCTGGTACAAAGGTGAAAGAGTGGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCC 
CAGGGCCCGCATACAGTGGTCGAGAGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAGAATGAC 
ACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGTATACCC 
GGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTG 
AACCTGAGACTCAGGACGCAACCTACCTGTGGTGGGTAAACAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTG 
TCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACACAGCAAGCTACAAATGTGAAACCCAGAA
CCCAGTGAGTGCCAGGCGCAGTGATTCAGTCATCCTGAATGTCCTCTATGGCCCGGATGCCCCCACCATTTCCCCTC 
TAAACACATCTTACAGATCAGGGGAAAATCTGAACCTCTCCTGCCACGCAGCCTCTAACCCACCTGCACAGTACTCT 
TGGTTTGTCAATGGGACTTTCCAGCAATCCACCCAAGAGCTCTTTATCCCCAACATCACTGTGAATAATAGTGGATC 
CTATACGTGCCAAGCCCATAACTCAGACACTGGCCTCAATAGGACCACAGTCACGACGATCACAGTCTATGCAGAGC 
CACCCAAACCCTTCATCACCAGCAACAACTCCAACCCCGTGGAGGATGAGGATGCTGTAGCCTTAACCTGTGAACCT 
GAGATTCAGAACACAACCTACCTGTGGTGGGTAAATAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTGTCCAA 
TGACAACAGGACCCTCACTCTACTCAGTGTCACAAGGAATGATGTAGGACCCTATGAGTGTGGAATCCAGAACGAAT 
TAAGTGTTGACCACAGCGACCCAGTCATCCTGAATGTCCTCTATGGCCCAGACGACCCCACCATTTCCCCCTCATAC 
ACCTATTACCGTCCAGGGGTGAACCTCAGCCTCTCCTGCCATGCAGCCTCTAACCCACCTGCACAGTATTCTTGGCT 
GATTGATGGGAACATCCAGCAACACACACAAGAGCTCTTTATCTCCAACATCACTGAGAAGAACAGCGGACTCTATA 
CCTGCCAGGCCAATAACTCAGCCAGTGGCCACAGCAGGACTACAGTCAAGACAATCACAGTCTCTGCGGAGCTGCCC 
AAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTGAACCTGAGGC 
TCAGAACACAACCTACCTGTGGTGGGTAAATGGTCAGAGCCTCCCAGTCAGTCCCAGGCTGCAGCTGTCCAATGGCA 
ACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACGCAAGAGCCTATGTATGTGGAATCCAGAACTCAGTGAGT 
GCAAACCGCAGTGACCCAGTCACCCTGGATGTCCTCTATGGGCCGGACACCCCCATCATTTCCCCCCCAGACTCGTC 
TTACCTTTCGGGAGCGAACCTCAACCTCTCCTGCCACTCGGCCTCTAACCCATCCCCGCAGTATTCTTGGCGTATCA 
ATGGGATACCGCAGCAACACACACAAGTTCTCTTTATCGCCAAAATCACGCCAAATAATAACGGGACCTATGCCTGT 
TTTGTCTCTAACTTGGCTACTGGCCGCAATAATTCCATAGTCAAGAGCATCACAGTCTCTGGTAAGTGGCTCCCTGG 
AGCATCAGCATCATATTCTGGGGTGGAGTCTATCTGGTTCTCACCAAAGAGCCAAGAAGACATTTTCTTTCCCAGTC 
TGTGTTCCATGGGCACAAGGAAATCCCAAATTCTATCCTGAGCCCCCTCACTCCATCTCGGCCAACTCTCTCCTCCC 
CGGCTTCTCTGATATCTCACGGCTGACCTCGGGTCCAGCCTGGAATGTGGGGAGGGGCCTCCCTTAGCCCCAGAAGG 
CCCCCAATAGTGAAAGGGACTTCATAGTCCAGAAGAAAGAAGGGTCCTTAAGGTCGAGTTGCTCCTCTCTATCACCA 
ATATGTCCCTTTCTGTCACCTCTTTGTGTTTTTTCACCTACTCTGTGAGCTACAAGGAACAAGGAGGCTTTGAAACC
AGCCCACACTTTTTCCCCAAATGAGAGGAGGAAGCCCCTTGGATGAGGCAGGAGCAGCTCAGACTCTGCTCCCTGCT 
CTGCGCCCGGCTCACCCGGTGACTGGCTCTGCCCTGGCTCCACTTGGGGTGGGACCGGGGCATGTGGAGAAGGTGTC 
CAGGTGGCCTGTTTTGAATCTGGGTAAATCAAGCTGCCAATCCACAGCAGAGCCTCCCTTGGGTCAGGTTGCAGGGA 
AATGGGAAAAGAGGGAGCCTCGGGACAGACTCCTGAGCTGTGTCCTGGCTCTGAAGTCACTGGCTGTATGAGGCTGT 
GGACACAGCACATAGGACACAGCAGAGGAAAGTGAGTGACACACACTTGGAGAAATAGGGAGATTCAGCCATAGGGG 
CTCTGCATGGGAGGGAACAGGCAGTGCCAAAAAGTGTGTGTTTATAGAGAGGGTAAGACTATCAGCCACTATATATA 
TCTAACATAAAACTTACCATTAACCATTTCTAAGTGTACAATTAAGTGAAACAGCATAAATATCAATCAAGTATATT 
GCCCGGTGTGGTGGCTCATCCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCGAGTGGATCACCTGAGGTCAGGAGT 
TCAAGATACAGAAAAAAAAAAATAGCTAGGCATGGTGGTGGGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCA 
GGAGAATCGCTCGAACCTGGGCGGTGTAGTTTGCAGTGAGCCGAGATTGAGCCACTGCACTCCAGCCTGGGTGACAG 
AGTGAGACTACATCACAAAAAAAAAAAAAAAAAAGG 



<210> SEQ ID NO 111 
<211> Length : 1,886

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 111 
>HUMCEA_PEA_1_T20

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC 
TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 
CCTGGCAGAGGCTCCTGCTCACAGCCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAA 
TCCACGCCGTTCAATGTCGCAGAGGGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTA 
CAGCTGGTACAAAGGTGAAAGAGTGGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCC 
CAGGGCCCGCATACAGTGGTCGAGAGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAGAATGAC 
ACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGTATACCC 
GGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTG 
AACCTGAGACTCAGGACGCAACCTACCTGTGGTGGGTAAACAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTG 
TCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACACAGCAAGCTACAAATGTGAAACCCAGAA 
CCCAGTGAGTGCCAGGCGCAGTGATTCAGTCATCCTGAATGTCCTCTATGGCCCGGATGCCCCCACCATTTCCCCTC 
TAAACACATCTTACAGATCAGGGGAAAATCTGAACCTCTCCTGCCACGCAGCCTCTAACCCACCTGCACAGTACTCT 
TGGTTTGTCAATGGGACTTTCCAGCAATCCACCCAAGAGCTCTTTATCCCCAACATCACTGTGAATAATAGTGGATC 
CTATACGTGCCAAGCCCATAACTCAGACACTGGCCTCAATAGGACCACAGTCACGACGATCACAGTCTATGCAGAGC 
CACCCAAACCCTTCATCACCAGCAACAACTCCAACCCCGTGGAGGATGAGGATGCTGTAGCCTTAACCTGTGAACCT 
GAGATTCAGAACACAACCTACCTGTGGTGGGTAAATAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTGTCCAA 
TGACAACAGGACCCTCACTCTACTCAGTGTCACAAGGAATGATGTAGGACCCTATGAGTGTGGAATCCAGAACGAAT 
TAAGTGTTGACCACAGCGACCCAGTCATCCTGAATGTCCTCTATGGCCCAGACGACCCCACCATTTCCCCCTCATAC 
ACCTATTACCGTCCAGGGGTGAACCTCAGCCTCTCCTGCCATGCAGCCTCTAACCCACCTGCACAGTATTCTTGGCT 
GATTGATGGGAACATCCAGCAACACACACAAGAGCTCTTTATCTCCAACATCACTGAGAAGAACAGCGGACTCTATA 
CCTGCCAGGCCAATAACTCAGCCAGTGGCCACAGCAGGACTACAGTCAAGACAATCACAGTCTCTGCGGAGCTGCCC 
AAGCCCTCCATCTCCAGCAACAACTCCAACCCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTGAACCTGAGGT 
TCAGAACACAACCTACCTGTGGTGGGTAAATGGTCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTGTCCAATGGCA 
ACATGACCCTCACTCTACTCAGCTGTCAAAAGGAACGATGCAGGATCCTATGAATGTGAAATACAGAACCCAGCGAG 
TGCCAACCGCAGTGACCCAGTCACCCTGAATGTCCTCT 

<210> SEQ ID NO 112 
<211> Length : 2,429 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 112 
>HUMCEA_PEA_1_T25

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC 

TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 

CCTGGCAGAGGCTCCTGCTCACAGCCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAA 

TCCACGCCGTTCAATGTCGCAGAGGGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTA 

CAGCTGGTACAAAGGTGAAAGAGTGGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCC 

CAGGGCCCGCATACAGTGGTCGAGAGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAGAATGAC 

ACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGTATACCC 

GGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTG 

AACCTGAGACTCAGGACGCAACCTACCTGTGGTGGGTAAACAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCTG 

TCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACACAGCAAGCTACAAATGTGAAACCCAGAA 

CCCAGTGAGTGCCAGGCGCAGTGATTCAGTCATCCTGAATGTCCTCTATGGGCCGGACACCCCCATCATTTCCCCCC 

CAGACTCGTCTTACCTTTCGGGAGCGAACCTCAACCTCTCCTGCCACTCGGCCTCTAACCCATCCCCGCAGTATTCT 

TGGCGTATCAATGGGATACCGCAGCAACACACACAAGTTCTCTTTATCGCCAAAATCACGCCAAATAATAACGGGAC 

CTATGCCTGTTTTGTCTCTAACTTGGCTACTGGCCGCAATAATTCCATAGTCAAGAGCATCACAGTCTCTGCATCTG 

GAACTTCTCCTGGTCTCTCAGCTGGGGCCACTGTCGGCATCATGATTGGAGTGCTGGTTGGGGTTGCTCTGATATAG 

CAGCCCTGGTGTAGTTTCTTCATTTCAGGAAGACTGACAGTTGTTTTGCTTCTTCCTTAAAGCATTTGCAACAGCTA 

CAGTCTAAAATTGCTTCTTTACCAAGGATATTTACAGAAAAGACTCTGACCAGAGATCGAGACCATCCTAGCCAACA 

TCGTGAAACCCCATCTCTACTAAAAATACAAAAATGAGCTGGGCTTGGTGGCGCGCACCTGTAGTCCCAGTTACTCG 

GGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGTGGAGATTGCAGTGAGCCCAGATCGCACCACTGCACTCCA 

GTCTGGCAACAGAGCAAGACTCCATCTCAAAAAGAAAAGAAAAGAAGACTCTGACCTGTACTCTTGAATACAAGTTT 

CTGATACCACTGCACTGTCTGAGAATTTCCAAAACTTTAATGAACTAACTGACAGCTTCATGAAACTGTCCACCAAG 

ATCAAGCAGAGAAAATAATTAATTTCATGGGACTAAATGAACTAATGAGGATAATATTTTCATAATTTTTTATTTGA 

AATTTTGCTGATTCTTTAAATGTCTTGTTTCCCAGATTTCAGGAAACTTTTTTTCTTTTAAGCTATCCACAGCTTAC 

AGCAATTTGATAAAATATACTTTTGTGAACAAAAATTGAGACATTTACATTTTCTCCCTATGTGGTCGCTCCAGACT 

TGGGAAACTATTCATGAATATTTATATTGTATGGTAATATAGTTATTGCACAAGTTCAATAAAAATCTGCTCTTTGT 

ATAACAGAATACATTTGAAAACATTGGTTATATTACCAAGACTTTGACTAGAATGTCGTATTTGAGGATATAAACCC 

ATAGGTAATAAACCCACAGGTACTACAAACAAAGTCTGAAGTCAGCCTTGGTTTGGCTTCCTAGTGTCAATTAAACT 

TCTAAAAGTTTAATCTGAGATTCCTTATAAAAACTTCCAGCAAAGCAACTTTAAAAAAGTCTGTGTGGGCCGGGCGC 

GGTGGCTCACGCCTGTAATCCCAGCACTTTGATCCGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCCAGACCATC 

CTGGCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAGTTAGCCGGGCGTGGTGGTGGGGGCCTGTAG 

TCCCAGCTACTCAGGAGGCTGAGGCAGGAGAACGGCATGAACCCGGGAGGCAGGGCTTGCAGTGAGCCAAGATCATG 

CCGCTGCACTCCAGCCTGGGAGACAAAGTGAGACTCCGTCAA 

<210> SEQ ID NO 113

<211> Length : 2,429

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 113 
>HUMCEA_PEA_1_T26

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC 

TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 

CCTGGCAGAGGCTCCTGCTCACAGCCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAA 

TCCACGCCGTTCAATGTCGCAGAGGGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTA 

CAGCTGGTACAAAGGTGAAAGAGTGGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCC 

CAGGGCCCGCATACAGTGGTCGAGAGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAGAATGAC 

ACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGTATACCC 

GGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTG 

AACCTGAGGCTCAGAACACAACCTACCTGTGGTGGGTAAATGGTCAGAGCCTCCCAGTCAGTCCCAGGCTGCAGCTG 

TCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACGCAAGAGCCTATGTATGTGGAATCCAGAA 

CTCAGTGAGTGCAAACCGCAGTGACCCAGTCACCCTGGATGTCCTCTATGGGCCGGACACCCCCATCATTTCCCCCC 

CAGACTCGTCTTACCTTTCGGGAGCGAACCTCAACCTCTCCTGCCACTCGGCCTCTAACCCATCCCCGCAGTATTCT 

TGGCGTATCAATGGGATACCGCAGCAACACACACAAGTTCTCTTTATCGCCAAAATCACGCCAAATAATAACGGGAC 

CTATGCCTGTTTTGTCTCTAACTTGGCTACTGGCCGCAATAATTCCATAGTCAAGAGCATCACAGTCTCTGCATCTG 

GAACTTCTCCTGGTCTCTCAGCTGGGGCCACTGTCGGCATCATGATTGGAGTGCTGGTTGGGGTTGCTCTGATATAG 

CAGCCCTGGTGTAGTTTCTTCATTTCAGGAAGACTGACAGTTGTTTTGCTTCTTCCTTAAAGCATTTGCAACAGCTA 

CAGTCTAAAATTGCTTCTTTACCAAGGATATTTACAGAAAAGACTCTGACCAGAGATCGAGACCATCCTAGCCAACA 

TCGTGAAACCCCATCTCTACTAAAAATACAAAAATGAGCTGGGCTTGGTGGCGCGCACCTGTAGTCCCAGTTACTCG 

GGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGTGGAGATTGCAGTGAGCCCAGATCGCACCACTGCACTCCA 

GTCTGGCAACAGAGCAAGACTCCATCTCAAAAAGAAAAGAAAAGAAGACTCTGACCTGTACTCTTGAATACAAGTTT 

CTGATACCACTGCACTGTCTGAGAATTTCCAAAACTTTAATGAACTAACTGACAGCTTCATGAAACTGTCCACCAAG 

ATCAAGCAGAGAAAATAATTAATTTCATGGGACTAAATGAACTAATGAGGATAATATTTTCATAATTTTTTATTTGA 

AATTTTGCTGATTCTTTAAATGTCTTGTTTCCCAGATTTCAGGAAACTTTTTTTCTTTTAAGCTATCCACAGCTTAC 

AGCAATTTGATAAAATATACTTTTGTGAACAAAAATTGAGACATTTACATTTTCTCCCTATGTGGTCGCTCCAGACT 

TGGGAAACTATTCATGAATATTTATATTGTATGGTAATATAGTTATTGCACAAGTTCAATAAAAATCTGCTCTTTGT

ATAACAGAATACATTTGAAAACATTGGTTATATTACCAAGACTTTGACTAGAATGTCGTATTTGAGGATATAAACCC 

ATAGGTAATAAACCCACAGGTACTACAAACAAAGTCTGAAGTCAGCCTTGGTTTGGCTTCCTAGTGTCAATTAAACT

TCTAAAAGTTTAATCTGAGATTCCTTATAAAAACTTCCAGCAAAGCAACTTTAAAAAAGTCTGTGTGGGCCGGGCGC 

GGTGGCTCACGCCTGTAATCCCAGCACTTTGATCCGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCCAGACCATC 
CTGGCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAGTTAGCCGGGCGTGGTGGTGGGGGCCTGTAG
TCCCAGCTACTCAGGAGGCTGAGGCAGGAGAACGGCATGAACCCGGGAGGCAGGGCTTGCAGTGAGCCAAGATCATG 
CCGCTGCACTCCAGCCTGGGAGACAAAGTGAGACTCCGTCAA 

<210> SEQ ID NO 114 

<211> Length : 3,898 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 114 

>R35137_PEA_1_PEA_1_PEA_1_T3

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 
AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT
CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 
ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 
GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG 
GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 
CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 
TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 
GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 
AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 
GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 
CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 
GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAA 
CCCCACCGGGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGC 
TGGCGGACGAGGTGCGCGGCGCGGGGGAGCGGGAAGCCGGGCAACAGTCCGCCCCCGTGACGCCTTGCGCCCTTCCA 
GGTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAGATGGGGCCGC 
CCTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGAGTGCGGGTTCCGCGGC 
GGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAAGCTGATGAGTGTGCGGCTGTGCCC 
GCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGCCCGCGCCCACCGACCCCTCCTTTGCGCAGTTCC 
AGGCTGAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCCAAGCTCACCGAGCAGGTCTTCAATGAGGCTCCT 
GGCATCAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCCGCGCGTGCAGCTGCCCCCGCGGGCGGTGGAGCG 
CGCTCAGGAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCCTCCTGGAGGAGACCGGCATCTGCGTGGTGC 
CAGGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGATGACCATTCTGCCCCCCTTGGAGAAACTGCGG 
CTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGTACTCCTGAGCACCCCAGCTGGGGCCAGG 
CTGGGTCGCCCTGGACTGTGTGCTCAGGAGCCCTGGGAGGCTCTGGAGCCCACTGTACTTGCTCTTGATGCCTGGCG
GGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCTAATAAAGCTGTGTGGCAGTCTGACTCCAAA 
AAGGAAGCGTTGGCAGCTGCGTGGCCCGCTCCCACCTGCCTACCCTTCTTGCAGGCCTGAGTCCCTTCAGAGAAGGG 
ACCTTCCACGGCCACCACCCACCTCTTCCTCCTGAAGACCCCGTGCCCACCATAGGCTGGGTCTTCCCTCTGGCCTC 
TGGTTGTGGGGCAGAGCCCGTCAGATCACACAGAAATGGGTTGAGAGGGTCCAGAGTGTGAGGAAAGCGCAGGCCCC 
ACACCCCTTTGTGGAAGCCCCCAAGAATCTAGGGAGCCAGGGGCCCAGGTGGCCACCCGAAGAAACACAGCCTTTCC 
TGAGGAAGGCACAGTGAACTGCCTCCTTCCTGGCTCCCTTTCCTGTGAGGTCCATGTCTTCCCTGGGGCAGGGGGAA 
ATACTAAACCAGCATGGTGCTGGCTGGTCAGGGTGACTGACAGCTCAGGAAGGAAGTCTTGGTTCTCTTACCCAAGG 
AAGCAGGGGTGGGGCCACTGTCTGGGGGGCCAGAGACCACCTTTGGTGTCATTGTGTGGTGCAGTCCTTCGGCTGGG
TGGAGTGGGAGGCAGAGGGAGAGGATAAGGGAGCGTCCCAGGGGAGGGCTGGGGCTGGAGGAGGCAGGGGCTGGGCT 
GAGCGGGAGTGGGCAGCGCTGTGTCCTGGCCTGGGGAGCATGGCTGAGCACCTACTACATGCAGACACTGCTGGGGG 
GTCACTCAATCTGCACAGATGCTCTTCTGAAGTAGGCATGATAGTCCCCATTTGATAGACGTGGAAACCTGCAACCC 
AAAAACCTGCTGAGCTGACAAGAAACCCCCTCAGAGGCCTCGGCAGCCAGAAAAATGCGTTGGGTCCAGTGCCCTCA 
AGTCCGCCAAGGACATGGGCTGGCTTTAGAGACTCACAAACTTGGGAGATAGGACTGGCCAAGGGCACCTGGTTTTT 
TTCGCTCTGGAGATGGTTCTTAACCACAGGCCACACACACTTCACAGCCTCATCTGGCCCTCGGGAGCCCCAGAGGG 
CACAGCTCTGGGCAGGAGACACAGCAGGTGGGCCCCTCCCTTGGCAGGGCGGGCTCGAATCAGGCAGGGTGCTCCTA 
GCCTTGTCACCGGACACCGAAGGGCGTCACGGGCAGTGGCTGGCGGTGTCCTTCTGAGCTAAGCTGGGTCCTGACCT 
TTCACACTTCCCTCCTAACTTCCATGGCTCTGTCACCGCCTTACGGAGGAGCTGAAGCCACAGACACAGCAAGGTTG 
GGGTCCGCACCGGAAGTATCCAGTGGTAGACGGCGGAACCCTTAAGAAACGGACGCCTTCATGCGGGCGGCTGGAGA 
AGCGGGGGCTGGGCACTGCAGCAACCACGCTTCGGCTGACACCAGGAAGGAAGCACGCCTGGAGCGGATCCGAAACT 
ACTGAGAGGGGCCAGGGCTGGCCGTGGGCGCAGGCCGATCCTTACATTCGAGGCCGGCCCAGCTCTGTAGCTTCCCC 
CTCTGGGCCTCTCACCCGCGCAGGACCTCGGTGGGAGCGCGCACGTGGCGGGGCGGGGGGCCGCGGGCCCGAGCCCG 
GACTGGCCACCGGGGGCGCCCGCGAGCTGACGCTCTGGCCCGTTGCGGTCTCTGTGGCCCGCGCGACCTTCCGGCCC 
TGGAGCCGGTGGCCGCGGGGCTCCAGCGACGCCGTGTGGTCCGTGCTCCGCTCTGTGGCTCCAGGGGGCCGAGAAAC 
TGCTGAGAGTCCGGCCCGGCCGGCAGTGCTGGGCGCGGGCCGAGGGCCCCGGGAGGCAGCGGCCCCGCCCTCTTTAC 
CTGCGGCCTCGCAGAGCATGCTGGGAGCCGCGGGAGGCAGTGGCCCCGCTCCCCTCACCTGCGGTATCGCAGAGCAT 
GGTGGGAGCCCCGGGAGGCAGTGGCCCCGCCCCCTTCCCTGCGGCCGC 

<210> SEQ ID NO 115 

<211> Length : 3,774 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 115 
>R35137_PEA_1_PEA_1_PEA_1_T5

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC

AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 

CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 

ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 

GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG 

GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 

CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 

TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 

GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 

AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 

GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 

CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 

GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAA 

CCCCACCGGGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGC 

TGGCGGACGAGGTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAG 

ATGGGGCCGCCCTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGAGTGCGG 

GTTCCGCGGCGGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAAGCTGATGAGTGTGC 

GGCTGTGCCCGCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGCCCGCGCCCACCGACCCCTCCTTT 

GCGCAGTTCCAGGCTGAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCCAAGCTCACCGAGCAGGTCTTCAA 

TGAGGCTCCTGGCATCAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCCGCGCGTGCAGCTGCCCCCGCGGG 

CGGTGGAGCGCGCTCAGGAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCCTCCTGGAGGAGACCGGCATC

TGCGTGGTGCCAGGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGATGACCATTCTGCCCCCCTTGGA 

GAAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGAGCCCTGGGAGGCTCTGGA 

GCCCACTGTACTTGCTCTTGATGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCT 

AATAAAGCTGTGTGGCAGTCTGACTCCAAAAAGGAAGCGTTGGCAGCTGCGTGGCCCGCTCCCACCTGCCTACCCTT 

CTTGCAGGCCTGAGTCCCTTCAGAGAAGGGACCTTCCACGGCCACCACCCACCTCTTCCTCCTGAAGACCCCGTGCC 

CACCATAGGCTGGGTCTTCCCTCTGGCCTCTGGTTGTGGGGCAGAGCCCGTCAGATCACACAGAAATGGGTTGAGAG 

GGTCCAGAGTGTGAGGAAAGCGCAGGCCCCACACCCCTTTGTGGAAGCCCCCAAGAATCTAGGGAGCCAGGGGCCCA 

GGTGGCCACCCGAAGAAACACAGCCTTTCCTGAGGAAGGCACAGTGAACTGCCTCCTTCCTGGCTCCCTTTCCTGTG 

AGGTCCATGTCTTCCCTGGGGCAGGGGGAAATACTAAACCAGCATGGTGCTGGCTGGTCAGGGTGACTGACAGCTCA 

GGAAGGAAGTCTTGGTTCTCTTACCCAAGGAAGCAGGGGTGGGGCCACTGTCTGGGGGGCCAGAGACCACCTTTGGT 

GTCATTGTGTGGTGCAGTCCTTCGGCTGGGTGGAGTGGGAGGCAGAGGGAGAGGATAAGGGAGCGTCCCAGGGGAGG 

GCTGGGGCTGGAGGAGGCAGGGGCTGGGCTGAGCGGGAGTGGGCAGCGCTGTGTCCTGGCCTGGGGAGCATGGCTGA 

GCACCTACTACATGCAGACACTGCTGGGGGGTCACTCAATCTGCACAGATGCTCTTCTGAAGTAGGCATGATAGTCC 

CCATTTGATAGACGTGGAAACCTGCAACCCAAAAACCTGCTGAGCTGACAAGAAACCCCCTCAGAGGCCTCGGCAGC 

CAGAAAAATGCGTTGGGTCCAGTGCCCTCAAGTCCGCCAAGGACATGGGCTGGCTTTAGAGACTCACAAACTTGGGA 

GATAGGACTGGCCAAGGGCACCTGGTTTTTTTCGCTCTGGAGATGGTTCTTAACCACAGGCCACACACACTTCACAG 

CCTCATCTGGCCCTCGGGAGCCCCAGAGGGCACAGCTCTGGGCAGGAGACACAGCAGGTGGGCCCCTCCCTTGGCAG 
GGCGGGCTCGAATCAGGCAGGGTGCTCCTAGCCTTGTCACCGGACACCGAAGGGCGTCACGGGCAGTGGCTGGCGGT
GTCCTTCTGAGCTAAGCTGGGTCCTGACCTTTCACACTTCCCTCCTAACTTCCATGGCTCTGTCACCGCCTTACGGA 
GGAGCTGAAGCCACAGACACAGCAAGGTTGGGGTCCGCACCGGAAGTATCCAGTGGTAGACGGCGGAACCCTTAAGA 
AACGGACGCCTTCATGCGGGCGGCTGGAGAAGCGGGGGCTGGGCACTGCAGCAACCACGCTTCGGCTGACACCAGGA 
AGGAAGCACGCCTGGAGCGGATCCGAAACTACTGAGAGGGGCCAGGGCTGGCCGTGGGCGCAGGCCGATCCTTACAT 
TCGAGGCCGGCCCAGCTCTGTAGCTTCCCCCTCTGGGCCTCTCACCCGCGCAGGACCTCGGTGGGAGCGCGCACGTG 
GCGGGGCGGGGGGCCGCGGGCCCGAGCCCGGACTGGCCACCGGGGGCGCCCGCGAGCTGACGCTCTGGCCCGTTGCG 
GTCTCTGTGGCCCGCGCGACCTTCCGGCCCTGGAGCCGGTGGCCGCGGGGCTCCAGCGACGCCGTGTGGTCCGTGCT 
CCGCTCTGTGGCTCCAGGGGGCCGAGAAACTGCTGAGAGTCCGGCCCGGCCGGCAGTGCTGGGCGCGGGCCGAGGGC 
CCCGGGAGGCAGCGGCCCCGCCCTCTTTACCTGCGGCCTCGCAGAGCATGCTGGGAGCCGCGGGAGGCAGTGGCCCC 
GCTCCCCTCACCTGCGGTATCGCAGAGCATGGTGGGAGCCCCGGGAGGCAGTGGCCCCGCCCCCTTCCCTGCGGCCG 

C 

<210> SEQ ID NO 116 

<211> Length : 3,998

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 116 

>R35137_PEA_1_PEA_1_PEA_1_T10

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 
AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 
CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 
ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 
GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG 
GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 
CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 
TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 
GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 
AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 
GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 
CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 
GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAA 
CCCCACCGGGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGC 
TGGCGGACGAGGTGCGCGGCGCGGGGGAGCGGGAAGCCGGGCAACAGTCCGCCCCCGTGACGCCTTGCGCCCTTCCA 
GGTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAGATGGGGCCGC 
CCTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGAGTGCGTGCGTACGAGG

CGGGTGGGGGCTCGCGGGCCATGGCCAGGCCCTCCTCGCCCGATGGGCCACCCCCTCCTCCGCACCTGACCTGGCCG 

TGCGCAGGTGCGGGTTCCGCGGCGGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAAG 

CTGATGAGTGTGCGGCTGTGCCCGCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGCCCGCGCCCAC 

CGACCCCTCCTTTGCGCAGTTCCAGGCTGAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCCAAGCTCACCG 

AGCAGGTCTTCAATGAGGCTCCTGGCATCAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCCGCGCGTGCAG 

CTGCCCCCGCGGGCGGTGGAGCGCGCTCAGGAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCCTCCTGGA 

GGAGACCGGCATCTGCGTGGTGCCAGGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGATGACCATTC 

TGCCCCCCTTGGAGAAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGTACTCC 

TGAGCACCCCAGCTGGGGCCAGGCTGGGTCGCCCTGGACTGTGTGCTCAGGAGCCCTGGGAGGCTCTGGAGCCCACT 

GTACTTGCTCTTGATGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCTAATAAAG 

CTGTGTGGCAGTCTGACTCCAAAAAGGAAGCGTTGGCAGCTGCGTGGCCCGCTCCCACCTGCCTACCCTTCTTGCAG 

GCCTGAGTCCCTTCAGAGAAGGGACCTTCCACGGCCACCACCCACCTCTTCCTCCTGAAGACCCCGTGCCCACCATA 

GGCTGGGTCTTCCCTCTGGCCTCTGGTTGTGGGGCAGAGCCCGTCAGATCACACAGAAATGGGTTGAGAGGGTCCAG 

AGTGTGAGGAAAGCGCAGGCCCCACACCCCTTTGTGGAAGCCCCCAAGAATCTAGGGAGCCAGGGGCCCAGGTGGCC 

ACCCGAAGAAACACAGCCTTTCCTGAGGAAGGCACAGTGAACTGCCTCCTTCCTGGCTCCCTTTCCTGTGAGGTCCA 

TGTCTTCCCTGGGGCAGGGGGAAATACTAAACCAGCATGGTGCTGGCTGGTCAGGGTGACTGACAGCTCAGGAAGGA 

AGTCTTGGTTCTCTTACCCAAGGAAGCAGGGGTGGGGCCACTGTCTGGGGGGCCAGAGACCACCTTTGGTGTCATTG 

TGTGGTGCAGTCCTTCGGCTGGGTGGAGTGGGAGGCAGAGGGAGAGGATAAGGGAGCGTCCCAGGGGAGGGCTGGGG 

CTGGAGGAGGCAGGGGCTGGGCTGAGCGGGAGTGGGCAGCGCTGTGTCCTGGCCTGGGGAGCATGGCTGAGCACCTA 

CTACATGCAGACACTGCTGGGGGGTCACTCAATCTGCACAGATGCTCTTCTGAAGTAGGCATGATAGTCCCCATTTG 

ATAGACGTGGAAACCTGCAACCCAAAAACCTGCTGAGCTGACAAGAAACCCCCTCAGAGGCCTCGGCAGCCAGAAAA 

ATGCGTTGGGTCCAGTGCCCTCAAGTCCGCCAAGGACATGGGCTGGCTTTAGAGACTCACAAACTTGGGAGATAGGA 

CTGGCCAAGGGCACCTGGTTTTTTTCGCTCTGGAGATGGTTCTTAACCACAGGCCACACACACTTCACAGCCTCATC 

TGGCCCTCGGGAGCCCCAGAGGGCACAGCTCTGGGCAGGAGACACAGCAGGTGGGCCCCTCCCTTGGCAGGGCGGGC 

TCGAATCAGGCAGGGTGCTCCTAGCCTTGTCACCGGACACCGAAGGGCGTCACGGGCAGTGGCTGGCGGTGTCCTTC 

TGAGCTAAGCTGGGTCCTGACCTTTCACACTTCCCTCCTAACTTCCATGGCTCTGTCACCGCCTTACGGAGGAGCTG 

AAGCCACAGACACAGCAAGGTTGGGGTCCGCACCGGAAGTATCCAGTGGTAGACGGCGGAACCCTTAAGAAACGGAC 

GCCTTCATGCGGGCGGCTGGAGAAGCGGGGGCTGGGCACTGCAGCAACCACGCTTCGGCTGACACCAGGAAGGAAGC 

ACGCCTGGAGCGGATCCGAAACTACTGAGAGGGGCCAGGGCTGGCCGTGGGCGCAGGCCGATCCTTACATTCGAGGC 

CGGCCCAGCTCTGTAGCTTCCCCCTCTGGGCCTCTCACCCGCGCAGGACCTCGGTGGGAGCGCGCACGTGGCGGGGC 

GGGGGGCCGCGGGCCCGAGCCCGGACTGGCCACCGGGGGCGCCCGCGAGCTGACGCTCTGGCCCGTTGCGGTCTCTG 

TGGCCCGCGCGACCTTCCGGCCCTGGAGCCGGTGGCCGCGGGGCTCCAGCGACGCCGTGTGGTCCGTGCTCCGCTCT 

GTGGCTCCAGGGGGCCGAGAAACTGCTGAGAGTCCGGCCCGGCCGGCAGTGCTGGGCGCGGGCCGAGGGCCCCGGGA 

GGCAGCGGCCCCGCCCTCTTTACCTGCGGCCTCGCAGAGCATGCTGGGAGCCGCGGGAGGCAGTGGCCCCGCTCCCC 

TCACCTGCGGTATCGCAGAGCATGGTGGGAGCCCCGGGAGGCAGTGGCCCCGCCCCCTTCCCTGCGGCCGC 

<210> SEQ ID NO 117

<211> Length : 4,071 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 117 

>R35137_PEA_1_PEA_1_PEA_1_T11

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 

AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 

CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 

ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 

GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG 

GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 

CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 

TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 

GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 

AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 

GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 

CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 

GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAA 

CCCCACCGGGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGC 

TGGCGGACGAGGTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAG 

ATGGGGCCGCCCTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGAGTGCGT 

GCGTACGAGGCGGGTGGGGGCTCGCGGGCCATGGCCAGGCCCTCCTCGCCCGATGGGCCACCCCCTCCTCCGCACCT 

GACCTGGCCGTGCGCAGGTGCGGGTTCCGCGGCGGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCA 

GATGCTGAAGCTGATGAGTGTGCGGCTGTGCCCGCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGC 

CCGCGCCCACCGACCCCTCCTTTGCGCAGTTCCAGGCTGAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCC 

AAGCTCACCGAGCAGGTCTTCAATGAGGCTCCTGGCATCAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCC 

GCGCGTGCAGCTGCCCCCGCGGGCGGTGGAGCGCGCTCAGGTCAGGCGGGGGCGGGGCCTGCGGGGTGGGCAGGGGG 

GGCCGGGCATCCCTCTCTGACGGCTCTCCGTCCACAGGAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCC 

TCCTGGAGGAGACCGGCATCTGCGTGGTGCCAGGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGTGA 

GGCCTGGCCCTCACTCCCTGTCCCGCCACCCTGGCCCTTCACTCACTGTCAACTCCTTTCAGGATGACCATTCTGCC 

CCCCTTGGAGAAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGTACTCCTGAG 

CACCCCAGCTGGGGCCAGGCTGGGTCGCCCTGGACTGTGTGCTCAGGAGCCCTGGGAGGCTCTGGAGCCCACTGTAC 

TTGCTCTTGATGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCTAATAAAGCTGT 

GTGGCAGTCTGACTCCAAAAAGGAAGCGTTGGCAGCTGCGTGGCCCGCTCCCACCTGCCTACCCTTCTTGCAGGCCT 
GAGTCCCTTCAGAGAAGGGACCTTCCACGGCCACCACCCACCTCTTCCTCCTGAAGACCCCGTGCCCACCATAGGCT

GGGTCTTCCCTCTGGCCTCTGGTTGTGGGGCAGAGCCCGTCAGATCACACAGAAATGGGTTGAGAGGGTCCAGAGTG 

TGAGGAAAGCGCAGGCCCCACACCCCTTTGTGGAAGCCCCCAAGAATCTAGGGAGCCAGGGGCCCAGGTGGCCACCC 

GAAGAAACACAGCCTTTCCTGAGGAAGGCACAGTGAACTGCCTCCTTCCTGGCTCCCTTTCCTGTGAGGTCCATGTC 

TTCCCTGGGGCAGGGGGAAATACTAAACCAGCATGGTGCTGGCTGGTCAGGGTGACTGACAGCTCAGGAAGGAAGTC 

TTGGTTCTCTTACCCAAGGAAGCAGGGGTGGGGCCACTGTCTGGGGGGCCAGAGACCACCTTTGGTGTCATTGTGTG 

GTGCAGTCCTTCGGCTGGGTGGAGTGGGAGGCAGAGGGAGAGGATAAGGGAGCGTCCCAGGGGAGGGCTGGGGCTGG 

AGGAGGCAGGGGCTGGGCTGAGCGGGAGTGGGCAGCGCTGTGTCCTGGCCTGGGGAGCATGGCTGAGCACCTACTAC 

ATGCAGACACTGCTGGGGGGTCACTCAATCTGCACAGATGCTCTTCTGAAGTAGGCATGATAGTCCCCATTTGATAG 

ACGTGGAAACCTGCAACCCAAAAACCTGCTGAGCTGACAAGAAACCCCCTCAGAGGCCTCGGCAGCCAGAAAAATGC 

GTTGGGTCCAGTGCCCTCAAGTCCGCCAAGGACATGGGCTGGCTTTAGAGACTCACAAACTTGGGAGATAGGACTGG 

CCAAGGGCACCTGGTTTTTTTCGCTCTGGAGATGGTTCTTAACCACAGGCCACACACACTTCACAGCCTCATCTGGC 

CCTCGGGAGCCCCAGAGGGCACAGCTCTGGGCAGGAGACACAGCAGGTGGGCCCCTCCCTTGGCAGGGCGGGCTCGA 

ATCAGGCAGGGTGCTCCTAGCCTTGTCACCGGACACCGAAGGGCGTCACGGGCAGTGGCTGGCGGTGTCCTTCTGAG 

CTAAGCTGGGTCCTGACCTTTCACACTTCCCTCCTAACTTCCATGGCTCTGTCACCGCCTTACGGAGGAGCTGAAGC 

CACAGACACAGCAAGGTTGGGGTCCGCACCGGAAGTATCCAGTGGTAGACGGCGGAACCCTTAAGAAACGGACGCCT 

TCATGCGGGCGGCTGGAGAAGCGGGGGCTGGGCACTGCAGCAACCACGCTTCGGCTGACACCAGGAAGGAAGCACGC 

CTGGAGCGGATCCGAAACTACTGAGAGGGGCCAGGGCTGGCCGTGGGCGCAGGCCGATCCTTACATTCGAGGCCGGC 

CCAGCTCTGTAGCTTCCCCCTCTGGGCCTCTCACCCGCGCAGGACCTCGGTGGGAGCGCGCACGTGGCGGGGCGGGG 

GGCCGCGGGCCCGAGCCCGGACTGGCCACCGGGGGCGCCCGCGAGCTGACGCTCTGGCCCGTTGCGGTCTCTGTGGC 

CCGCGCGACCTTCCGGCCCTGGAGCCGGTGGCCGCGGGGCTCCAGCGACGCCGTGTGGTCCGTGCTCCGCTCTGTGG 

CTCCAGGGGGCCGAGAAACTGCTGAGAGTCCGGCCCGGCCGGCAGTGCTGGGCGCGGGCCGAGGGCCCCGGGAGGCA 

GCGGCCCCGCCCTCTTTACCTGCGGCCTCGCAGAGCATGCTGGGAGCCGCGGGAGGCAGTGGCCCCGCTCCCCTCAC 

CTGCGGTATCGCAGAGCATGGTGGGAGCCCCGGGAGGCAGTGGCCCCGCCCCCTTCCCTGCGGCCGC 

<210> SEQ ID NO 118 

<211> Length : 4,138 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 118 

>R35137_PEA_1_PEA_1_PEA_1_T12

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 
AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 
CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 
ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 
GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG

GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 

CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 

TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 

GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 

AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 

GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 

CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 

GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAA 

CCCCACCGGGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGC 

TGGCGGACGAGGTGCGCGGCGCGGGGGAGCGGGAAGCCGGGCAACAGTCCGCCCCCGTGACGCCTTGCGCCCTTCCA 

GGTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAGATGGGGCCGC 

CCTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGAGTGCGTGCGTACGAGG 

CGGGTGGGGGCTCGCGGGCCATGGCCAGGCCCTCCTCGCCCGATGGGCCACCCCCTCCTCCGCACCTGACCTGGCCG 

TGCGCAGGTGCGGGTTCCGCGGCGGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAAG 

CTGATGAGTGTGCGGCTGTGCCCGCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGCCCGCGCCCAC 

CGACCCCTCCTTTGCGCAGTTCCAGGCTGAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCCAAGCTCACCG 

AGCAGGTCTTCAATGAGGCTCCTGGCATCAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCCGCGCGTGCAG 

CTGCCCCCGCGGGCGGTGGAGCGCGCTCAGGTCAGGCGGGGGCGGGGCCTGCGGGGTGGGCAGGGGGGGCCGGGCAT 

CCCTCTCTGACGGCTCTCCGTCCACAGGAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCCTCCTGGAGGA 

GACCGGCATCTGCGTGGTGCCAGGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGTGAGGCCTGGCCC 

TCACTCCCTGTCCCGCCACCCTGGCCCTTCACTCACTGTCAACTCCTTTCAGGATGACCATTCTGCCCCCCTTGGAG 

AAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGTACTCCTGAGCACCCCAGCT 

GGGGCCAGGCTGGGTCGCCCTGGACTGTGTGCTCAGGAGCCCTGGGAGGCTCTGGAGCCCACTGTACTTGCTCTTGA 

TGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCTAATAAAGCTGTGTGGCAGTCT 

GACTCCAAAAAGGAAGCGTTGGCAGCTGCGTGGCCCGCTCCCACCTGCCTACCCTTCTTGCAGGCCTGAGTCCCTTC 

AGAGAAGGGACCTTCCACGGCCACCACCCACCTCTTCCTCCTGAAGACCCCGTGCCCACCATAGGCTGGGTCTTCCC 

TCTGGCCTCTGGTTGTGGGGCAGAGCCCGTCAGATCACACAGAAATGGGTTGAGAGGGTCCAGAGTGTGAGGAAAGC 

GCAGGCCCCACACCCCTTTGTGGAAGCCCCCAAGAATCTAGGGAGCCAGGGGCCCAGGTGGCCACCCGAAGAAACAC 

AGCCTTTCCTGAGGAAGGCACAGTGAACTGCCTCCTTCCTGGCTCCCTTTCCTGTGAGGTCCATGTCTTCCCTGGGG 

CAGGGGGAAATACTAAACCAGCATGGTGCTGGCTGGTCAGGGTGACTGACAGCTCAGGAAGGAAGTCTTGGTTCTCT 

TACCCAAGGAAGCAGGGGTGGGGCCACTGTCTGGGGGGCCAGAGACCACCTTTGGTGTCATTGTGTGGTGCAGTCCT 

TCGGCTGGGTGGAGTGGGAGGCAGAGGGAGAGGATAAGGGAGCGTCCCAGGGGAGGGCTGGGGCTGGAGGAGGCAGG 

GGCTGGGCTGAGCGGGAGTGGGCAGCGCTGTGTCCTGGCCTGGGGAGCATGGCTGAGCACCTACTACATGCAGACAC 

TGCTGGGGGGTCACTCAATCTGCACAGATGCTCTTCTGAAGTAGGCATGATAGTCCCCATTTGATAGACGTGGAAAC 

CTGCAACCCAAAAACCTGCTGAGCTGACAAGAAACCCCCTCAGAGGCCTCGGCAGCCAGAAAAATGCGTTGGGTCCA 

GTGCCCTCAAGTCCGCCAAGGACATGGGCTGGCTTTAGAGACTCACAAACTTGGGAGATAGGACTGGCCAAGGGCAC 

CTGGTTTTTTTCGCTCTGGAGATGGTTCTTAACCACAGGCCACACACACTTCACAGCCTCATCTGGCCCTCGGGAGC 
CCCAGAGGGCACAGCTCTGGGCAGGAGACACAGCAGGTGGGCCCCTCCCTTGGCAGGGCGGGCTCGAATCAGGCAGG
GTGCTCCTAGCCTTGTCACCGGACACCGAAGGGCGTCACGGGCAGTGGCTGGCGGTGTCCTTCTGAGCTAAGCTGGG 
TCCTGACCTTTCACACTTCCCTCCTAACTTCCATGGCTCTGTCACCGCCTTACGGAGGAGCTGAAGCCACAGACACA 
GCAAGGTTGGGGTCCGCACCGGAAGTATCCAGTGGTAGACGGCGGAACCCTTAAGAAACGGACGCCTTCATGCGGGC 
GGCTGGAGAAGCGGGGGCTGGGCACTGCAGCAACCACGCTTCGGCTGACACCAGGAAGGAAGCACGCCTGGAGCGGA 
TCCGAAACTACTGAGAGGGGCCAGGGCTGGCCGTGGGCGCAGGCCGATCCTTACATTCGAGGCCGGCCCAGCTCTGT 
AGCTTCCCCCTCTGGGCCTCTCACCCGCGCAGGACCTCGGTGGGAGCGCGCACGTGGCGGGGCGGGGGGCCGCGGGC 
CCGAGCCCGGACTGGCCACCGGGGGCGCCCGCGAGCTGACGCTCTGGCCCGTTGCGGTCTCTGTGGCCCGCGCGACC 
TTCCGGCCCTGGAGCCGGTGGCCGCGGGGCTCCAGCGACGCCGTGTGGTCCGTGCTCCGCTCTGTGGCTCCAGGGGG 
CCGAGAAACTGCTGAGAGTCCGGCCCGGCCGGCAGTGCTGGGCGCGGGCCGAGGGCCCCGGGAGGCAGCGGCCCCGC 
CCTCTTTACCTGCGGCCTCGCAGAGCATGCTGGGAGCCGCGGGAGGCAGTGGCCCCGCTCCCCTCACCTGCGGTATC 
GCAGAGCATGGTGGGAGCCCCGGGAGGCAGTGGCCCCGCCCCCTTCCCTGCGGCCGC 

<210> SEQ ID NO 119 

<211> Length : 1,250 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 119 

>R35137_PEA_1_PEA_1_PEA_1_T14

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 
AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 
CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 
ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAGAGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAG 
GCATGGACTGAGGGCGAAGGTGCTGACGCTGGACGGCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTG 
GCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAGGAGCTGCGCCAGGGTGTGAAGAAGCCTTTCACCGAGGTCATC 
CGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCACCTTCCTGCGCCAGGTCTTGGCCCTCTGTGT 
TAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCGCATCTTGCAGGCGTGTG 
GGGGCCACAGTCTGGGGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAG 
AGGCGTGACGGAGGCATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTGACGGT 
GCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACTCTACT 
CGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGGACGTG 
GCCGAGCTTCACCGTGCACTGGGCCAGGCGCGGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCGGATGAC 
CATTCTGCCCCCCTTGGAGAAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCCTCGAGT 
ACTCCTGAGCACCCCAGCTGGGGCCAGGCTGGGTCGCCCTGGACTGTGTGCTCAGGAGCCCTGGGAGGCTCTGGAGC 
CCACTGTACTTGCTCTTGATGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCCCTGCCTCTCTGCAGGTCCCTAA
TAAAGCTGTGTGGCAGTC 

<210> SEQ ID NO 120 

<211> Length : 1,292 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 120 
>Z25299_PEA_2_T1

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTGGAGTCTGTCCT 
CCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAG 
ATGTTGTCCTGACACTTGTGGCATCAAATGCCTGGATCCTGTTGACACCCCAAACCCAACAAGGAGGAAGCCTGGGA 
AGTGCCCAGTGACTTATGGCCAATGTTTGATGCTTAACCCCCCCAATTTCTGTGAGATGGATGGCCAGTGCAAGCGT 
GACTTGAAGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTTTCCCCTGTGAAAGGTAAGCAGGGGATGAGGGCACA 
CTGAGCTCCCTCCAGCCCTCTCAGCCTCAACCCTCTGGAGGCCCAGGCATATGGGCAGGGGGACTCCTGAACCCTAC 
TCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCTTCAAGAGAACTGTTCTCCAGGTCTCAGGGCCAGGATTTCCATA 
GGATCGCCTGTGGCTTTGATTCTATTCTAGTGTCTCTGGGTGGGGGTCCTGGGCAAGTGTCTTTCTGAGTCTCAGTT 
TCTTTATCGGTAAAATGTACATAATGAGATTGAAAGTGCTCTGCAAAGCACTATGTGCACTAAGAATTTATTATTCA 
GGTTGTTTCCATCATGTTTTCTGAGGTGAAATCACAAAGGATCAGTGGAGTTTGAGGATTATCTAGTTCAATGCTTT 
GAGTTTAGAGTTTTACGGTGAAAATGAGACTTGTCTCCTGACACTAAGTCTCTCTCAACTATAGCGCTATCTTGCTA 
TTTTCTCTATCTCAGAAGGATCCTTGGGCAGGAGGAAGGATGTGGATATGATTTGGCTGGTTTCTATGCTGAAGCTC 
TTATCTGATTTTCTCTCACAGCTTGATTCCTGCCATATGGAGGAGGCTCTGGAGTCCTGCTCTGTGTGGTCCAGGTC 
CTTTCCACCCTGAGACTTGGCTCCACCACTGATATCCTCCTTTGGGGAAAGGCTTGGCACACAGCAGGCTTTCAAGA 
AGTGCCAGTTGATCAATGAATAAATAAACGAGCCTATTTCTCTTTGCAAAACCTGCTTCT 

<210> SEQ ID NO 121 

<211> Length : 886 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 121 
>Z25299_PEA_2_T2

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTGGAGTCTGTCCT 
CCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAG 
ATGTTGTCCTGACACTTGTGGCATCAAATGCCTGGATCCTGTTGACACCCCAAACCCAACAAGGAGGAAGCCTGGGA 
AGTGCCCAGTGACTTATGGCCAATGTTTGATGCTTAACCCCCCCAATTTCTGTGAGATGGATGGCCAGTGCAAGCGT 
GACTTGAAGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTTTCCCCTGTGAAAGGTGAAAAGAGACATCACAAGCA 
ATTGAGGGACCAGGAAGTGGATCCTCTAGAGATGAGGAGGCATTCTGCTGGATGACTTTTAAAAATGTTTTCTCCAG 
AGTCATCTCTCTCATTAACAATGTTTTTTGTCTTAGAAATTTCTTGTTGATTTTTAAACTTACATGATTTCTTGTTT 
TGGTATGAATACAGGCTGCTTCAGTCCTTCAATAAGCCCATCACACTTTTTCACCATGTCATCTATCAGCACTTTTT 
CTGCAGTGTTACGAACATCAGCTTCATCACTGTCAGCCTGCGTTTTGCCTGCAACCCATCAAATGAGGTCAGGAGAG 
GAGTTTTCCACTTTTGGCTTCATGTTGGTGCTCAAAACT 

<210> SEQ ID NO 122 

<211> Length : 696 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 122 
>Z25299_PEA_2_T3

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTGGAGTCTGTCCT 
CCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAG 
ATGTTGTCCTGACACTTGTGGCATCAAATGCCTGGATCCTGTTGACACCCCAAACCCAACAAGGAGGAAGCCTGGGA 
AGTGCCCAGTGACTTATGGCCAATGTTTGATGCTTAACCCCCCCAATTTCTGTGAGATGGATGGCCAGTGCAAGCGT 
GACTTGAAGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTTTCCCCTGTGAAAGGCTGCTTCAGTCCTTCAATAAG 
CCCATCACACTTTTTCACCATGTCATCTATCAGCACTTTTTCTGCAGTGTTACGAACATCAGCTTCATCACTGTCAG 
CCTGCGTTTTGCCTGCAACCCATCAAATGAGGTCAGGAGAGGAGTTTTCCACTTTTGGCTTCATGTTGGTGCTCAAA 
ACT 

<210> SEQ ID NO 123 
<211> Length : 706 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 123 
>Z25299_PEA_2_T6

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTGGAGTCTGTCCT 
CCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAG 
ATGTTGTCCTGACACTTGTGGCATCAAATGCCTGGATCCTGTTGACACCCCAAACCCAAGAGGAAGCCTGGGAAGTG 
CCCAGTGACTTATGGCCAATGTTTGATGCTTAACCCCCCCAATTTCTGTGAGATGGATGGCCAGTGCAAGCGTGACT 
TGAAGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTTTCCCCTGTGAAAGCTTGATTCCTGCCATATGGAGGAGGC 
TCTGGAGTCCTGCTCTGTGTGGTCCAGGTCCTTTCCACCCTGAGACTTGGCTCCACCACTGATATCCTCCTTTGGGG 
AAAGGCTTGGCACACAGCAGGCTTTCAAGAAGTGCCAGTTGATCAATGAATAAATAAACGAGCCTATTTCTCTTTGC 

AAAACCTGCTTCT 

<210> SEQ ID NO 124 

<211> Length : 560 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 124 
>Z25299_PEA_2_T9

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGTCCTTCAAAGCTGGAGTCTGTCCT 
CCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAG 
ATGTTGTCCTGACACTTGTGGCATCAAATGCCTGGATCCTGTTGACACCCCAAACCCAACTTGATTCCTGCCATATG 
GAGGAGGCTCTGGAGTCCTGCTCTGTGTGGTCCAGGTCCTTTCCACCCTGAGACTTGGCTCCACCACTGATATCCTC 
CTTTGGGGAAAGGCTTGGCACACAGCAGGCTTTCAAGAAGTGCCAGTTGATCAATGAATAAATAAACGAGCCTATTT 

CTCTTTGCAAAACCTGCTTCT 



<210> SEQ ID NO 125 
<211> Length : 3,194
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 125 
>HSSTROL3_T5

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 
ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 
GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 
GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 
AAGAGGTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAGGATCCTTCGGTTCCCATGGCAGTT 
GGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCGATGTGACGCCACTCACCTTTACTG 
AGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACTGGCATGGGGACGACCTGCCGTTTGATGGG 
CCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAAGGGGATGTCCACTTCGACTATGATGAGAC 
CTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCACGTGCTGGGGCTGC 
AGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGAC 
TGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGC 
TGGGATAGACACCAATGAGATTGCACCGCTGGAGCCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGG 
TCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGTGGCGCCTCCGTGGGGGCCAGCTGCAGCCC 
GGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCTGTGGACGCTGCCTTCGAGGATGCCCAGGG 
CCACATTTGGTTCTTCCAAGGTGCTCAGTACTGGGTGTACGACGGTGAAAAGCCAGTCCTGGGCCCCGCACCCCTCA 
CCGAGCTGGGCCTGGTGAGGTTCCCGGTCCATGCTGCCTTGGTCTGGGGTCCCGAGAAGAACAAGATCTACTTCTTC 
CGAGGCAGGGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAGGGCCACTGACTG 
GAGAGGGGTGCCCTCTGAGATCGACGCTGCCTTCCAGGATGCTGATGGTGCGTTGGGGGTGAGGCAGCTGGTGGGAG 
GTGGGCACAGCAGCCGCTTCTCCCACCTGGTGGTGGCTGGGCTCCCACATGCCTGCCACAGGAAGTCTGGCTCTTCA 
TCACAGGTCCTTTGTCCAGAGCCATCTGCCCTCCTCTCGGTGGCCGGCTAGTGCTACATTCCATATTGCAGATGAGG 
AAACTGAGGGTCAGAGAAGTGCAAGGTCTTACCCTGGTTTTTCAGCCACAGCCAGTAGAACAATAAACTGCTGTACA 
CTGAGGGCCAACAATGCTCTAAGCTCCTTACTGGTCTCATCCAGTTCTCAGAACAGCCCTCTGATGTGACACCTGTT 
GTGAACCCAGTTTCCAGAGGAGCAAACAGAGGCTCAGGCAATGAGGCCCCTAACCTGGACTACCCTGGTGGTCCCTG 
CTCCTAACCACTGACCCACCCAGCCTCCCACAACCACAGGGGGCTAGAGCCAGTCCAGTGCTCCCTCCCCTGCTAGG 
CTCCTCTTCTGTGCTCTTTCTCCCACATCAGGACCCACTGGGAGAGCTATCCTAGGGTAGCCTCCAGCTCCAGGACT 
CCAGGGTGCCCGTCAATAGCCTGGCTAATTTAATAGATGCAGGAGAGAGTGATGTGGAGGGTGGTGGGGGCAACGGG 
ACTTGCTTTCCTGAGAGGTGGGACTCAGGCCTCTGAGGCTCTGGGTACCTGTCAGGCTGGGTATTAGCCCAGCCCAG 
ATTCCGGGGCAGGCAGAAGGGCTCCCTAGAGGGAAGAGAGGTTCTGAAAGGCCGGCCCTGGATCCTGCAGGACTCGA 
GGAACTCAGCAGTGGCCAAGGGCTTCCCACTCAGCCCTCCCTTAGTGCCCATCCCTGGGCACAGCCTGACAGGCAGG 
AGTAGGGCCCAGTGTCCACTCGCCCAGGCTTGACCACCTTCTCTTCTCAGGCTATGCCTACTTCCTGCGCGGCCGCC 
TCTACTGGAAGTTTGACCCTGTGAAGGTGAAGGCTCTGGAAGGCTTCCCCCGTCTCGTGGGTCCTGACTTCTTTGGC 
TGTGCCGAGCCTGCCAACACTTTCCTCTGACCATGGCTTGGATGCCCTCAGGGGTGCTGACCCCTGCCAGGCCACGA 
ATATCAGGCTAGAGACCCATGGCCATCTTTGTGGCTGTGGGCACCAGGCATGGGACTGAGCCCATGTCTCCTCAGGG
GGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGCCAGCGACTGTCT 
CAGACTGGGCAGGGAGGCTTTGGCATGACTTAAGAGGAAGGGCAGTCTTGGGCCCGCTATGCAGGTCCTGGCAAACC 
TGGCTGCCCTGTCTCCATCCCTGTCCCTCAGGGTAGCACCATGGCAGGACTGGGGGAACTGGAGTGTCCTTGCTGTA 
TCCCTGTTGTGAGGTTCCTTCCAGGGGCTGGCACTGAAGCAAGGGTGCTGGGGCCCCATGGCCTTCAGCCCTGGCTG 
AGCAACTGGGCTGTAGGGCAGGGCCACTTCCTGAGGTCAGGTCTTGGTAGGTGCCTGCATCTGTCTGCCTTCTGGCT 
GACAATCCTGGAAATCTGTTCTCCAGAATCCAGGCCAAAAAGTTCACAGTCAAATGGGGAGGGGTATTCTTCATGCA 
GGAGACCCCAGGCCCTGGAGGCTGCAACATACCTCAATCCTGTCCCAGGCCGGATCCTCCTGAAGCCCTTTTCGCAG 
CACTGCTATCCTCCAAAGCCATTGTAAATGTGTGTACAGTGTGTATAAACCTTCTTCTTCTTTTTTTTTTTTAAACT 
GAGGATTGTCATTAAACACAGTTGTTTTCTACCTGCC 

<210> SEQ ID NO 126 

<211> Length : 2,705 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 126 
>HSSTROL3_T8 

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 

ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 

GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 

GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 

AAGAGGTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAGGATCCTTCGGTTCCCATGGCAGTT 

GGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCGATGTGACGCCACTCACCTTTACTG 

AGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACTGGCATGGGGACGACCTGCCGTTTGATGGG 

CCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAAGGGGATGTCCACTTCGACTATGATGAGAC 

CTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCACGTGCTGGGGCTGC 

AGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGAC 

TGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGC 

TGGGATAGACACCAATGAGATTGCACCGCTGGAGCCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGG 

TCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGTGGCGCCTCCGTGGGGGCCAGCTGCAGCCC 

GGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCTGTGGACGCTGCCTTCGAGGATGCCCAGGG 

CCACATTTGGTTCTTCCAAGAGCTGGGATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCC 

AGGGACTTCACAAATGAAGGCACAGCATGGGAAACCTGCGTGGGTTCCAGGGCAGTCCAGCCTGCAGGGGCCCAGGG 

AGTGGTCAGTAGGCATTTGTCACAGCCAAATGCCAGTGGAAGGAGCAGCCGCCCAGGCAGCCCTCTACTGATGAGAG 

TAACCTCACCCGTGCACTAGTTTACAGAGCATTCACTGCCCCAGCTTATCCCAGGCCTCCCGCTTCCCTCTGCGGGT 
GGGGTGCTGAGCAGGCATTATTGGCCTGCATGTTTTACTGATGAGGAAACTGAGGCTGGGAGAGTCTGTGGTAGGGG
TCAAGCAGGTCCACAGTGGCGGGGCATGGCAGTGGTGGCTGGGCAGGTCCTTGCAGCCTTCCCTCTCCGGCAGGTGC 
TCAGTACTGGGTGTACGACGGTGAAAAGCCAGTCCTGGGCCCCGCACCCCTCACCGAGCTGGGCCTGGTGAGGTTCC 
CGGTCCATGCTGCCTTGGTCTGGGGTCCCGAGAAGAACAAGATCTACTTCTTCCGAGGCAGGGACTACTGGCGTTTC 
CACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAGGGCCACTGACTGGAGAGGGGTGCCCTCTGAGATCGA 
CGCTGCCTTCCAGGATGCTGATGGCTATGCCTACTTCCTGCGCGGCCGCCTCTACTGGAAGTTTGACCCTGTGAAGG 
TGAAGGCTCTGGAAGGCTTCCCCCGTCTCGTGGGTCCTGACTTCTTTGGCTGTGCCGAGCCTGCCAACACTTTCCTC 
TGACCATGGCTTGGATGCCCTCAGGGGTGCTGACCCCTGCCAGGCCACGAATATCAGGCTAGAGACCCATGGCCATC 
TTTGTGGCTGTGGGCACCAGGCATGGGACTGAGCCCATGTCTCCTCAGGGGGATGGGGTGGGGTACAACCACCATGA 
CAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGCCAGCGACTGTCTCAGACTGGGCAGGGAGGCTTTGGCATG 
ACTTAAGAGGAAGGGCAGTCTTGGGCCCGCTATGCAGGTCCTGGCAAACCTGGCTGCCCTGTCTCCATCCCTGTCCC 
TCAGGGTAGCACCATGGCAGGACTGGGGGAACTGGAGTGTCCTTGCTGTATCCCTGTTGTGAGGTTCCTTCCAGGGG 
CTGGCACTGAAGCAAGGGTGCTGGGGCCCCATGGCCTTCAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCAC 
TTCCTGAGGTCAGGTCTTGGTAGGTGCCTGCATCTGTCTGCCTTCTGGCTGACAATCCTGGAAATCTGTTCTCCAGA 
ATCCAGGCCAAAAAGTTCACAGTCAAATGGGGAGGGGTATTCTTCATGCAGGAGACCCCAGGCCCTGGAGGCTGCAA 
CATACCTCAATCCTGTCCCAGGCCGGATCCTCCTGAAGCCCTTTTCGCAGCACTGCTATCCTCCAAAGCCATTGTAA 
ATGTGTGTACAGTGTGTATAAACCTTCTTCTTCTTTTTTTTTTTTAAACTGAGGATTGTCATTAAACACAGTTGTTT 
TCTACCTGCC 

<210> SEQ ID NO 127 

<211> Length : 3,332

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 127 
>HSSTROL3_T9 

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 
ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 
GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 
GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 
AAGAGGTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAGGATCCTTCGGTTCCCATGGCAGTT 
GGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCGATGTGACGCCACTCACCTTTACTG 
AGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACTGGCATGGGGACGACCTGCCGTTTGATGGG 
CCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAAGGGGATGTCCACTTCGACTATGATGAGAC 
CTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCACGTGCTGGGGCTGC 
AGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGAC 
TGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGC

TGGGATAGACACCAATGAGATTGCACCGCTGGAGCCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGG 

TCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGTGGCGCCTCCGTGGGGGCCAGCTGCAGCCC 

GGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCTGTGGACGCTGCCTTCGAGGATGCCCAGGG 

CCACATTTGGTTCTTCCAAGAGCTGGGATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCC 

AGGGACTTCACAAATGAAGGCACAGCATGGGAAACCTGCGTGGGTTCCAGGGCAGTCCAGCCTGCAGGGGCCCAGGG 

AGTGGTGCTCAGTACTGGGTGTACGACGGTGAAAAGCCAGTCCTGGGCCCCGCACCCCTCACCGAGCTGGGCCTGGT 

GAGGTTCCCGGTCCATGCTGCCTTGGTCTGGGGTCCCGAGAAGAACAAGATCTACTTCTTCCGAGGCAGGGACTACT 

GGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAGGGCCACTGACTGGAGAGGGGTGCCCTCT 

GAGATCGACGCTGCCTTCCAGGATGCTGATGGTGCGTTGGGGGTGAGGCAGCTGGTGGGAGGTGGGCACAGCAGCCG 

CTTCTCCCACCTGGTGGTGGCTGGGCTCCCACATGCCTGCCACAGGAAGTCTGGCTCTTCATCACAGGTCCTTTGTC

CAGAGCCATCTGCCCTCCTCTCGGTGGCCGGCTAGTGCTACATTCCATATTGCAGATGAGGAAACTGAGGGTCAGAG 

AAGTGCAAGGTCTTACCCTGGTTTTTCAGCCACAGCCAGTAGAACAATAAACTGCTGTACACTGAGGGCCAACAATG 

CTCTAAGCTCCTTACTGGTCTCATCCAGTTCTCAGAACAGCCCTCTGATGTGACACCTGTTGTGAACCCAGTTTCCA 

GAGGAGCAAACAGAGGCTCAGGCAATGAGGCCCCTAACCTGGACTACCCTGGTGGTCCCTGCTCCTAACCACTGACC 

CACCCAGCCTCCCACAACCACAGGGGGCTAGAGCCAGTCCAGTGCTCCCTCCCCTGCTAGGCTCCTCTTCTGTGCTC 

TTTCTCCCACATCAGGACCCACTGGGAGAGCTATCCTAGGGTAGCCTCCAGCTCCAGGACTCCAGGGTGCCCGTCAA 

TAGCCTGGCTAATTTAATAGATGCAGGAGAGAGTGATGTGGAGGGTGGTGGGGGCAACGGGACTTGCTTTCCTGAGA 

GGTGGGACTCAGGCCTCTGAGGCTCTGGGTACCTGTCAGGCTGGGTATTAGCCCAGCCCAGATTCCGGGGCAGGCAG 

AAGGGCTCCCTAGAGGGAAGAGAGGTTCTGAAAGGCCGGCCCTGGATCCTGCAGGACTCGAGGAACTCAGCAGTGGC 

CAAGGGCTTCCCACTCAGCCCTCCCTTAGTGCCCATCCCTGGGCACAGCCTGACAGGCAGGAGTAGGGCCCAGTGTC 

CACTCGCCCAGGCTTGACCACCTTCTCTTCTCAGGCTATGCCTACTTCCTGCGCGGCCGCCTCTACTGGAAGTTTGA 

CCCTGTGAAGGTGAAGGCTCTGGAAGGCTTCCCCCGTCTCGTGGGTCCTGACTTCTTTGGCTGTGCCGAGCCTGCCA 

ACACTTTCCTCTGACCATGGCTTGGATGCCCTCAGGGGTGCTGACCCCTGCCAGGCCACGAATATCAGGCTAGAGAC 

CCATGGCCATCTTTGTGGCTGTGGGCACCAGGCATGGGACTGAGCCCATGTCTCCTCAGGGGGATGGGGTGGGGTAC 

AACCACCATGACAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGCCAGCGACTGTCTCAGACTGGGCAGGGAG 

GCTTTGGCATGACTTAAGAGGAAGGGCAGTCTTGGGCCCGCTATGCAGGTCCTGGCAAACCTGGCTGCCCTGTCTCC 

ATCCCTGTCCCTCAGGGTAGCACCATGGCAGGACTGGGGGAACTGGAGTGTCCTTGCTGTATCCCTGTTGTGAGGTT 

CCTTCCAGGGGCTGGCACTGAAGCAAGGGTGCTGGGGCCCCATGGCCTTCAGCCCTGGCTGAGCAACTGGGCTGTAG 

GGCAGGGCCACTTCCTGAGGTCAGGTCTTGGTAGGTGCCTGCATCTGTCTGCCTTCTGGCTGACAATCCTGGAAATC 

TGTTCTCCAGAATCCAGGCCAAAAAGTTCACAGTCAAATGGGGAGGGGTATTCTTCATGCAGGAGACCCCAGGCCCT 

GGAGGCTGCAACATACCTCAATCCTGTCCCAGGCCGGATCCTCCTGAAGCCCTTTTCGCAGCACTGCTATCCTCCAA 

AGCCATTGTAAATGTGTGTACAGTGTGTATAAACCTTCTTCTTCTTTTTTTTTTTTAAACTGAGGATTGTCATTAAA 

CACAGTTGTTTTCTACCTGCC 



<210> SEQ ID NO 128 
<211> Length : 3,052 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 128 
>HSSTROL3_T10 

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 

ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 

GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 

GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 

AAGAGGTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAGGATCCTTCGGTTCCCATGGCAGTT 

GGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCGATGTGACGCCACTCACCTTTACTG 

AGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACTGGCATGGGGACGACCTGCCGTTTGATGGG 

CCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAAGGGGATGTCCACTTCGACTATGATGAGAC 

CTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCACGTGCTGGGGCTGC 

AGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGAC 

TGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGC 

TGGGATAGACACCAATGAGATTGCACCGCTGGAGCCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGG 

TCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGTGGCGCCTCCGTGGGGGCCAGCTGCAGCCC 

GGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCTGTGGACGCTGCCTTCGAGGATGCCCAGGG 

CCACATTTGGTTCTTCCAAGGGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAG 

GGCCACTGACTGGAGAGGGGTGCCCTCTGAGATCGACGCTGCCTTCCAGGATGCTGATGGTGCGTTGGGGGTGAGGC 

AGCTGGTGGGAGGTGGGCACAGCAGCCGCTTCTCCCACCTGGTGGTGGCTGGGCTCCCACATGCCTGCCACAGGAAG 

TCTGGCTCTTCATCACAGGTCCTTTGTCCAGAGCCATCTGCCCTCCTCTCGGTGGCCGGCTAGTGCTACATTCCATA 

TTGCAGATGAGGAAACTGAGGGTCAGAGAAGTGCAAGGTCTTACCCTGGTTTTTCAGCCACAGCCAGTAGAACAATA 

AACTGCTGTACACTGAGGGCCAACAATGCTCTAAGCTCCTTACTGGTCTCATCCAGTTCTCAGAACAGCCCTCTGAT 

GTGACACCTGTTGTGAACCCAGTTTCCAGAGGAGCAAACAGAGGCTCAGGCAATGAGGCCCCTAACCTGGACTACCC 

TGGTGGTCCCTGCTCCTAACCACTGACCCACCCAGCCTCCCACAACCACAGGGGGCTAGAGCCAGTCCAGTGCTCCC 

TCCCCTGCTAGGCTCCTCTTCTGTGCTCTTTCTCCCACATCAGGACCCACTGGGAGAGCTATCCTAGGGTAGCCTCC 

AGCTCCAGGACTCCAGGGTGCCCGTCAATAGCCTGGCTAATTTAATAGATGCAGGAGAGAGTGATGTGGAGGGTGGT 

GGGGGCAACGGGACTTGCTTTCCTGAGAGGTGGGACTCAGGCCTCTGAGGCTCTGGGTACCTGTCAGGCTGGGTATT 

AGCCCAGCCCAGATTCCGGGGCAGGCAGAAGGGCTCCCTAGAGGGAAGAGAGGTTCTGAAAGGCCGGCCCTGGATCC 

TGCAGGACTCGAGGAACTCAGCAGTGGCCAAGGGCTTCCCACTCAGCCCTCCCTTAGTGCCCATCCCTGGGCACAGC 

CTGACAGGCAGGAGTAGGGCCCAGTGTCCACTCGCCCAGGCTTGACCACCTTCTCTTCTCAGGCTATGCCTACTTCC 

TGCGCGGCCGCCTCTACTGGAAGTTTGACCCTGTGAAGGTGAAGGCTCTGGAAGGCTTCCCCCGTCTCGTGGGTCCT 

GACTTCTTTGGCTGTGCCGAGCCTGCCAACACTTTCCTCTGACCATGGCTTGGATGCCCTCAGGGGTGCTGACCCCT 

GCCAGGCCACGAATATCAGGCTAGAGACCCATGGCCATCTTTGTGGCTGTGGGCACCAGGCATGGGACTGAGCCCAT 

GTCTCCTCAGGGGGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGC 
CAGCGACTGTCTCAGACTGGGCAGGGAGGCTTTGGCATGACTTAAGAGGAAGGGCAGTCTTGGGCCCGCTATGCAGG
TCCTGGCAAACCTGGCTGCCCTGTCTCCATCCCTGTCCCTCAGGGTAGCACCATGGCAGGACTGGGGGAACTGGAGT 
GTCCTTGCTGTATCCCTGTTGTGAGGTTCCTTCCAGGGGCTGGCACTGAAGCAAGGGTGCTGGGGCCCCATGGCCTT 
CAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCACTTCCTGAGGTCAGGTCTTGGTAGGTGCCTGCATCTGTC 
TGCCTTCTGGCTGACAATCCTGGAAATCTGTTCTCCAGAATCCAGGCCAAAAAGTTCACAGTCAAATGGGGAGGGGT 
ATTCTTCATGCAGGAGACCCCAGGCCCTGGAGGCTGCAACATACCTCAATCCTGTCCCAGGCCGGATCCTCCTGAAG 
CCCTTTTCGCAGCACTGCTATCCTCCAAAGCCATTGTAAATGTGTGTACAGTGTGTATAAACCTTCTTCTTCTTTTT 
TTTTTTTAAACTGAGGATTGTCATTAAACACAGTTGTTTTCTACCTGCC 

<210> SEQ ID NO 129 

<211> Length : 2,359

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 129 
>HSSTROL3_T11

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 
ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 
GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 
GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 
AAGAGGTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAGGATCCTTCGGTTCCCATGGCAGTT 
GGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCGATGTGACGCCACTCACCTTTACTG 
AGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACTGGCATGGGGACGACCTGCCGTTTGATGGG 
CCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAAGGGGATGTCCACTTCGACTATGATGAGAC 
CTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCACGTGCTGGGGCTGC 
AGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGAC 
TGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGC 
TGGGATAGACACCAATGAGATTGCACCGCTGGAGGTGAGGCCCTGCCTGCCAGTCCCCCTACTCCTCTGCTGGCCAC 
TGTGACTGCAGCATATGCCCTCAGCATGTGTCCCTCTCTCCCACCCCAGCCAGACGCCCCGCCAGATGCCTGTGAGG 
CCTCCTTTGACGCGGTCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGTGGCGCCTCCGTGGG 
GGCCAGCTGCAGCCCGGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCTGTGGACGCTGCCTT 
CGAGGATGCCCAGGGCCACATTTGGTTCTTCCAAGGTGCTCAGTACTGGGTGTACGACGGTGAAAAGCCAGTCCTGG 
GCCCCGCACCCCTCACCGAGCTGGGCCTGGTGAGGTTCCCGGTCCATGCTGCCTTGGTCTGGGGTCCCGAGAAGAAC 
AAGATCTACTTCTTCCGAGGCAGGGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCG 
CAGGGCCACTGACTGGAGAGGGGTGCCCTCTGAGATCGACGCTGCCTTCCAGGATGCTGATGGCTATGCCTACTTCC 
TGCGCGGCCGCCTCTACTGGAAGTTTGACCCTGTGAAGGTGAAGGCTCTGGAAGGCTTCCCCCGTCTCGTGGGTCCT 




GACTTCTTTGGCTGTGCCGAGCCTGCCAACACTTTCCTCTGACCATGGCTTGGATGCCCTCAGGGGTGCTGACCCCT 
GCCAGGCCACGAATATCAGGCTAGAGACCCATGGCCATCTTTGTGGCTGTGGGCACCAGGCATGGGACTGAGCCCAT 
GTCTCCTCAGGGGGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGC 
CAGCGACTGTCTCAGACTGGGCAGGGAGGCTTTGGCATGACTTAAGAGGAAGGGCAGTCTTGGGCCCGCTATGCAGG
TCCTGGCAAACCTGGCTGCCCTGTCTCCATCCCTGTCCCTCAGGGTAGCACCATGGCAGGACTGGGGGAACTGGAGT 
GTCCTTGCTGTATCCCTGTTGTGAGGTTCCTTCCAGGGGCTGGCACTGAAGCAAGGGTGCTGGGGCCCCATGGCCTT 
CAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCACTTCCTGAGGTCAGGTCTTGGTAGGTGCCTGCATCTGTC 
TGCCTTCTGGCTGACAATCCTGGAAATCTGTTCTCCAGAATCCAGGCCAAAAAGTTCACAGTCAAATGGGGAGGGGT 
ATTCTTCATGCAGGAGACCCCAGGCCCTGGAGGCTGCAACATACCTCAATCCTGTCCCAGGCCGGATCCTCCTGAAG 
CCCTTTTCGCAGCACTGCTATCCTCCAAAGCCATTGTAAATGTGTGTACAGTGTGTATAAACCTTCTTCTTCTTTTT 
TTTTTTTAAACTGAGGATTGTCATTAAACACAGTTGTTTTCTACCTGCC 

<210> SEQ ID NO 130 

<211> Length : 2,077 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 130 
>HSSTROL3_T12 

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 
ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCGGACGTCCACCACCTCCATGCCGA 
GAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGCCCCTGCCACGCAGGAAGCCCCCC 
GGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATGGGCTGAGTGCCCGCAACCGACAG 
AAGAGGATCCTTCGGTTCCCATGGCAGTTGGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATG 
GAGCGATGTGACGCCACTCACCTTTACTGAGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAGGTACT 
GGCATGGGGACGACCTGCCGTTTGATGGGCCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACCGAGAA 
GGGGATGTCCACTTCGACTATGATGAGACCTGGACTATCGGGGATGACCAGGGCACAGACCTGCTGCAGGTGGCAGC 
CCATGAATTTGGCCACGTGCTGGGGCTGCAGCACACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTC 
GCTACCCACTGAGTCTCAGCCCAGATGACTGCAGGGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACC
TCCAGGACCCCAGCCCTGGGCCCCCAGGCTGGGATAGACACCAATGAGATTGCACCGCTGGAGCCAGACGCCCCGCC 
AGATGCCTGTGAGGCCTCCTTTGACGCGGTCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAAAGCGGGCTTTGTGT 
GGCGCCTCCGTGGGGGCCAGCTGCAGCCCGGCTACCCAGCATTGGCCTCTCGCCACTGGCAGGGACTGCCCAGCCCT 
GTGGACGCTGCCTTCGAGGATGCCCAGGGCCACATTTGGTTCTTCCAAGGGACTACTGGCGTTTCCACCCCAGCACC 
CGGCGTGTAGACAGTCCCGTGCCCCGCAGGGCCACTGACTGGAGAGGGGTGCCCTCTGAGATCGACGCTGCCTTCCA 
GGATGCTGATGGCTATGCCTACTTCCTGCGCGGCCGCCTCTACTGGAAGTTTGACCCTGTGAAGGTGAAGGCTCTGG 
AAGGCTTCCCCCGTCTCGTGGGTCCTGACTTCTTTGGCTGTGCCGAGCCTGCCAACACTTTCCTCTGACCATGGCTT 
GGATGCCCTCAGGGGTGCTGACCCCTGCCAGGCCACGAATATCAGGCTAGAGACCCATGGCCATCTTTGTGGCTGTG
GGCACCAGGCATGGGACTGAGCCCATGTCTCCTCAGGGGGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGA 
GGGCCACGCAGGTCGTGGTCACCTGCCAGCGACTGTCTCAGACTGGGCAGGGAGGCTTTGGCATGACTTAAGAGGAA 
GGGCAGTCTTGGGCCCGCTATGCAGGTCCTGGCAAACCTGGCTGCCCTGTCTCCATCCCTGTCCCTCAGGGTAGCAC 
CATGGCAGGACTGGGGGAACTGGAGTGTCCTTGCTGTATCCCTGTTGTGAGGTTCCTTCCAGGGGCTGGCACTGAAG 
CAAGGGTGCTGGGGCCCCATGGCCTTCAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCACTTCCTGAGGTCA 
GGTCTTGGTAGGTGCCTGCATCTGTCTGCCTTCTGGCTGACAATCCTGGAAATCTGTTCTCCAGAATCCAGGCCAAA 
AAGTTCACAGTCAAATGGGGAGGGGTATTCTTCATGCAGGAGACCCCAGGCCCTGGAGGCTGCAACATACCTCAATC 
CTGTCCCAGGCCGGATCCTCCTGAAGCCCTTTTCGCAGCACTGCTATCCTCCAAAGCCATTGTAAATGTGTGTACAG 
TGTGTATAAACCTTCTTCTTCTTTTTTTTTTTTAAACTGAGGATTGTCATTAAACACAGTTGTTTTCTACCTGCC 



<210> SEQ ID NO 131 

<211> Length : 1,266 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 131 
>HUMTREFAC_PEA_2_T4 

CGCTCCCCAGTAGAGGACCCGGAACCAGAACTGGAATCCGCCCTTACCGCTTGCTGCCAAAACAGTGGGGGCTGAAC 
TGACCTCTCCCCTTTGGGAGAGAAAAACTGTCTGGGAGCTTGACAAAGGCATGCAGGAGAGAACAGGAGCAGCCACA 
GCCAGGAGGGAGAGCCTTCCCCAAGCAAACAATCCAGAGCAGCTGTGCAAACAACGGTGCATAAATGAGGCCTCCTG 
GACCATGAAGCGAGTCCTGAGCTGCGTCCCGGAGCCCACGGTGGTCATGGCTGCCAGAGCGCTCTGCATGCTGGGGC 
TGGTCCTGGCCTTGCTGTCCTCCAGCTCTGCTGAGGAGTACGTGGGCCTGTGGAAAGTGCATCTTCCTAAGGGCGAG 
GGTTTCAGCAGTGGTTGAACTCGGCGGGGTGGGGCGGAGCGGGAGGATGCAAACTTGCAAAGTGAAGCAAACACACT 
CACCGCAGCCCAGCAAGGGCTCTGGCAGCTGACAGGGCTTTGTCTGGGACAGCTGCAAACCAGTGTGCCGTGCCAGC 
CAAGGACAGGGTGGACTGCGGCTACCCCCATGTCACCCCCAAGGAGTGCAACAACCGGGGCTGCTGCTTTGACTCCA 
GGATCCCTGGAGTGCCTTGGTGTTTCAAGCCCCTGCAGGAAGCAGAATGCACCTTCTGAGGCACCTCCAGCTGCCCC 
CGGCCGGGGGATGCGAGGCTCGGAGCACCCTTGCCCGGCTGTGATTGCTGCCAGGCACTGTTCATCTCAGCTTTTCT 
GTCCCTTTGCTCCCGGCAAGCGCTTCTGCTGAAAGTTCATATCTGGAGCCTGATGTCTTAACGAATAAAGGTCCCAT 
GCTCCACCCGAGGACAGTTCTTCGTGCCTGAGACTTTCTGAGGTTGTGCTTTATTTCTGCTGCGTCGTGGGAGAGGG 
CGGGAGGGTGTCAGGGGAGAGTCTGCCCAGGCCTCAAGGGCAGGAAAAGACTCCCTAAGGAGCTGCAGTGCATGCAA 
GGATATTTTGAATCCAGACTGGCACCCACGTCACAGGAAAGCCTAGGAACACTGTAAGTGCCGCTTCCTCGGGAAAG 
CAGAAAAAATACATTTCAGGTAGAAGTTTTCAAAAATCACAAGTCTTTCTTGGTGAAGACAGCAAGCCAATAAAACT 




GTCTTCCAAAGTGGTCCTTTATTTCACAACCACTCTCGCTACTGTTCAATACTTGTACTATTCCTGGGTTTTGTTTC 
TTTGTACAGTAAACATTATGAACAAACAGGCAAA 

<210> SEQ ID NO 132 

<211> Length : 747 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 132 
>HUMTREFAC_PEA_2_T5

CGCTCCCCAGTAGAGGACCCGGAACCAGAACTGGAATCCGCCCTTACCGCTTGCTGCCAAAACAGTGGGGGCTGAAC 
TGACCTCTCCCCTTTGGGAGAGAAAAACTGTCTGGGAGCTTGACAAAGGCATGCAGGAGAGAACAGGAGCAGCCACA 
GCCAGGAGGGAGAGCCTTCCCCAAGCAAACAATCCAGAGCAGCTGTGCAAACAACGGTGCATAAATGAGGCCTCCTG 
GACCATGAAGCGAGTCCTGAGCTGCGTCCCGGAGCCCACGGTGGTCATGGCTGCCAGAGCGCTCTGCATGCTGGGGC 
TGGTCCTGGCCTTGCTGTCCTCCAGCTCTGCTGAGGAGTACGTGGGCCTGTCCCAGCAAGGGCTCTGGCAGCTGACA 
GGGCTTTGTCTGGGACAGCTGCAAACCAGTGTGCCGTGCCAGCCAAGGACAGGGTGGACTGCGGCTACCCCCATGTC 
ACCCCCAAGGAGTGCAACAACCGGGGCTGCTGCTTTGACTCCAGGATCCCTGGAGTGCCTTGGTGTTTCAAGCCCCT 
GCAGGAAGCAGAATGCACCTTCTGAGGCACCTCCAGCTGCCCCCGGCCGGGGGATGCGAGGCTCGGAGCACCCTTGC 
CCGGCTGTGATTGCTGCCAGGCACTGTTCATCTCAGCTTTTCTGTCCCTTTGCTCCCGGCAAGCGCTTCTGCTGAAA 
GTTCATATCTGGAGCCTGATGTCTTAACGAATAAAGGTCCCATGCTCCACCCGA 

<210> SEQ ID NO 133 

<211> Length : 2,201 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 133 
>HSS100PCB_T1 

TGAGACAAGATGTCACTCTGTCACCCAGGCTGGAGTGCAGTGGCAGGATCACGGCTCACTGCAGCCTCGACCTCCCT 
GGGCTCAGGTGATCCTCCCACCTCAGCCTACCGAGTAGCTGGGACTACAGGTGCATGTCACCATACCCGGTTAATTT 
TTGTATTTTTTTTAGAGACAAGGTCTCACCATGTTGCCCAGGCTGGTCTCAAACTCCTGTGCTCAGGCAATGCGTCA 
GCCTCGACATCTCAAAGTGCTGTGATTACAGGCGTGAGCCCCGACACCTGGCCTAGTTCTATTTTCTAAATGTGAAT 
TCTGTAAAGATATCTTTTAAAAATAAAGTTCTGTTTTTGGTAGAAAATGTAAAAATAGATAAATATGGAGGGAAGAA 
ATCCCCCCTGGAATACAGACGCTTCCTCTCCCTTCCAGCCTTTTCCCCATATGAACATTGCTGTGAGTGAGACTTAC

ATGCAATGTAATTTCTTTTTGAGCTTAACATTACAACATAAATTCTCAAACTCTGATGTTCATTAAACACCCCAGCC 

CCATCCTGGGAACTTGGGCTTGGGGCTCGGGGTGTTCTGATAATGATCAAAGTATGAGAATTGAACCCATGAGGACT 

TTGATCCAAGATACTGGGGTGTGGGGAGGGGCAGGCACAGGTGTCCTGGGAACACACTTTGAGAAGCAATGGCAAAG 

CTGGGGGTCCAGCTAATGTGTTACATTAGAATCACCTCGGGGAGGCCCTGGGTGCCCTTCTCAGCCCTCCCTCCGGA 

GGCTGCTGAAGCCCAGCAAAGCCGGAGTCAGAGAACAATGTCCGCCTGAGGGCAGGGCTGGGCTGGGCTGGCCTTCT 

GGCCCTATCTGCTCCGTGCCCAACCCAGCGCCCCGCACAGTCGGAGCTTTGTAAATACGAGGTGACTGTCTGCCTAC 

AAACTTTGTAAACATCACTTGAAATGGCCGCAGGGCATTGCGACATGGCCATACCACTATTTGTTTGCTATTGAATT 

TGTACTTCCCTGCCTTACTTTTGCTATTGCAAACCATGCTGTCACTAAGGTCTTCATGCACACAGTTGTGTCTTGGT 

CAGATGATATGTTTCTACCAATTTTAATTGTGTTTCTTTCCACCTGGACACACAGCTCTCTGGCCCAGGGCTGGGTC 

ATCAGCACACCCTGCTGCTGCTGTTCAGATCTGCATCCTGGTCCCGCTTGGTCCCACAGTGAGAACGCTTTGCTATC 

ACATGGGCAGGCTCTGAGAGCCCTGCCGGCCTGGCCTTCTCAAAGAAGACCTGAGAGCTTGGGACCCAAGCAGAGAG 

GAAGAACAGGGCTCAGGGTGCTTGCTCCATGCTCGCTCCACACCTGGGGCTCAACCCTGGCTTTCCCCGGCTCCCTG 

TGTGACTTCAGGGCAGGTCCCTTGGGCCCTCTGGGCCTTATCATCTTCATCTGTAACAGGGCGATGCCTCTGCCGTG 

TCTGGTGGTGTTGAGGAGTTCCTGTTTGTGTAAGCAGCTAGTTCAGTGCCAGCACGAGATGGGAGGCCCATGAAGTT 

AGCAGTGCACAAAAAATAGAGCAAAGACTGGATGCATTTCCTGAGAACAACCATCACTGTAAAGCACTTTACAAATC 

CAAAGACAACCCCCGGCAAAAACTCAAAATGAAACTCCCTCTCGCAGAGCACAATTCCAATTCGCTCTAAAAACATT 

ACAAGTTAGTTCATGTCATGCCAGATAGCTGAAGGCAGCTCACAAGTTCTTAAGGCCAGGAATGCCATGTGTCTGCT 

ATGCACAGCTGGCCCTGGCCCTGAGCCTGAATGACAGCACAAAGGTGACGCAGATGTGGGTGCCCTGCTCCTGCCCA 

GCAGCAGTGCTTGGTGGAGGCTGAGGCCCTGCACAGGCACCCTCACTGCTGACCTTGAGCCTCTCTCTCCTCTAGAG 

TGGAAAAGACAAGGATGCCGTGGATAAATTGCTCAAGGACCTGGACGCCAATGGAGATGCCCAGGTGGACTTCAGTG 

AGTTCATCGTGTTCGTGGCTGCAATCACGTCTGCCTGTCACAAGTACTTTGAGAAGGCAGGACTCAAATGATGCCCT 

GGAGATGTCACAGATTCCTGGCAGAGCCATGGTCCCAGGCTTCCCAAAAGTGTTTGTTGGCAATTATTCCCCTAGGC 

TGAGCCTGCTCATGTACCTCTGATTAATAAATGCTTATGAAATGA 



<210> SEQ ID NO 134 

<211> Length : 5,503 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 134 
>R20779_T7 

CTGTCCCCGGCCCCGGCGCGGGGGAGACGTGAGCGTGCACACGTACACACACAGCAGGGGAAGAGGCGCTCCAAGCG 
GCGCCCAACTTTCTCCTTCCCTCCACGGGCCGGGTGAGAAAGTAGCCGGGGGCTATCCCGACCCGGCGGTTCTTGGG 
GAGGGGGCCGAACAAGAAAAGGGAGGAGATGGAGATAACTTCCCCGGATTTAGCTTTTTTGTCTTTGTTTTTGTTCT

CACCACTTCCATCGGATGACTGGAGAGTAAAAGGGAACCCGGAGCGGGGTGGCGAGCAGCGCTTTGAGAAAATGCAG 

GAGTGTGTTTGGAGACGCGTAAAGTTGCCTTTCAAGCTCTGGCCTCCGGGCACGCGATGCTCCGCGGCGGGCTGACT 

CAGGGCTGCCTTGGGCCTCCCTGCCACCCTCCTGGAAATGATGCAAGTCCTGACTGTCACCTGGATCCCTGCAGCCC 

AGCCTGGAATGCGTCTGGATTAGGGGAAAGACGAGAAACGACACTCCAGGTGTTGCACGGCCCACCAAAGCGGGAAG 

ATAGGGCAGTTGCTCAGACCAAATACTGTATCTAGTGCTTCTGCTCCTATCTTCAATCGTGGGGTTCTTTTTAATGC 

AAAGTGTCACAAGGCCAGGAATTCCCATGTGTGCTCAGTTGGCCCACAGCATCATTGTGCCTAGGAAACTGCTTCAA 

TTTATCAAGTCCTCTGGGCTGGGAATCTCACTGAATTCCAAACGGCGGAAAGAGGAAACTTTCCCAACCCGATGTGG 

GTGTGACGCGAGCCAGGGGCCCCAGGGACACTGTCCCAGAGCACACCGTCCCCCTTTAACAGCAACTGGAGCTTGGA 

TTCGCTCTTATATTGTACAGTCCTTTCGACCATTGCCCTGGAGCACCCGCACACGCGCACGCATCTCCGGCCGCGCT 

CACACACACTCATACACACGCACGCAAACGCGTGGCCGCCGCCAGGTCGGCAACTTTGTCCGGCGCTCCCAGCGGCG 

CTCGGCTTCCTCCTGTAGTAGTTGAGCGCAGGCCCCGCCTCCCGGCCGTGTTGTCAAAAGGGCCGGGGTCTCGGATT 

GGTCCAGCCGCCGGGACAACACCTGCTCGACTCCTTCATTCAAGTGACACCAGAGCTTCCAGGGATATTTGAGGCAC 

CATCCCTGCCATTGCCGGGCACTCGCGGCGCTGCTAACGGCCTGGTCACATGCTCTCCGGAGAGCTACGGGAGGGCG 

CTGGGTAACCTCTATCCGAGCCGCGGCCGCGAGGAGGAGGGAAAAGGCGAGCAAAAAGGAAGAGTGGGAGGAGGAGG 

GGAAGCGGCGAAGGAGGAAGAGGAGGAGGAGGAAGAGGGGAGCACAAAGGATCCAGGTCTCCCGACGGGAGGTTAAT 

ACCAAGAACCATGTGTGCCGAGCGGCTGGGCCAGTTCATGACCCTGGCTTTGGTGTTGGCCACCTTTGACCCGGCGC 

GGGGGACCGACGCCACCAACCCACCCGAGGGTCCCCAAGACAGGAGCTCCCAGCAGAAAGGCCGCCTGTCCCTGCAG 

AATACAGCGGAGATCCAGCACTGTTTGGTCAACGCTGGCGATGTGGGGTGTGGCGTGTTTGAATGTTTCGAGAACAA 

CTCTTGTGAGATTCGGGGCTTACATGGGATTTGCATGACTTTTCTGCACAACGCTGGAAAATTTGATGCCCAGGGCA 

AGTCATTCATCAAAGACGCCTTGAAATGTAAGGCCCACGCTCTGCGGCACAGGTTCGGCTGCATAAGCCGGAAGTGC 

CCGGCCATCAGGGAAATGGTGTCCCAGTTGCAGCGGGAATGCTACCTCAAGCACGACCTGTGCGCGGCTGCCCAGGA 

GAACACCCGGGTGATAGTGGAGATGATCCATTTCAAGGACTTGCTGCTGCACGAATGCTACAAGATAGAAATTACTA 

TGCCCAAGAGGAGGAAAGTGAAGCTAAGAGATTAGAGAACTCGGACTGAGACCCTACGTGGACCTCGTGAACTTGCT 

GCTGACCTGTGGGGAGGAGGTGAAGGAGGCCATCACCCACAGCGTGCAGGTTCAGTGTGAGCAGAACTGGGGAAGCC 

TGTGCTCCATCTTGAGCTTCTGCACCTCGGCCATCCAGAAGCCTCCCACGGCGCCCCCCGAGCGCCAGCCCCAGGTG 

GACAGAACCAAGCTCTCCAGGGCCCACCACGGGGAAGCAGGACATCACCTCCCAGAGCCCAGCAGTAGGGAGACTGG 

CCGAGGTGCCAAGGGTGAGCGAGGTAGCAAGAGCCACCCAAACGCCCATGCCCGAGGCAGAGTCGGGGGCCTTGGGG 

CTCAGGGACCTTCCGGAAGCAGCGAGTGGGAAGACGAACAGTCTGAGTATTCTGATATCCGGAGGTGAAATGAAAGG 

CCTGGCCACGAAATCTTTCCTCCACGCCGTCCATTTTCTTATCTATGGACATTCCAAAACATTTACCATTAGAGAGG 

GGGGATGTCACACGCAGGATTCTGTGGGGACTGTGGACTTCATCGAGGTGTGTGTTCGCGGAACGGACAGGTGAGAT 

GGAGACCCCTGGGGCCGTGGGGTCTCAGGGGTGCCTGGTGAATTCTGCACTTACACGTACTCAAGGGAGCGCGCCCG 

CGTTATCCTCGTACCTTTGTCTTCTTTCCATCTGTGGAGTCAGTGGGTGTCGGCCGCTCTGTTGTGGGGGAGGTGAA 

CCAGGGAGGGGCAGGGCAAGGCAGGGCCCCCAGAGCTGGGCCACACAGTGGGTGCTGGGCCTCGCCCCGAAGCTTCT 

GGTGCAGCAGCCTCTGGTGCTGTCTCCGCGGAAGTCAGGGCGGCTGGATTCCAGGACAGGAGTGAATGTAAAAATAA 

ATATCGCTTAGAATGCAGGAGAAGGGTGGAGAGGAGGCAGGGGCCGAGGGGGTGCTTGGTGCCAAACTGAAATTCAG 

TTTCTTGTGTGGGGCCTTGCGGTTCAGAGCTCTTGGCGAGGGTGGAGGGAGGAGTGTCATTTCTATGTGTAATTTCT 

GAGCCATTGTACTGTCTGGGCTGGGGGGGACACTGTCCAAGGGAGTGGCCCCTATGAGTTTATATTTTAACCACTGC 
TTCAAATCTCGATTTCACTTTTTTTATTTATCCAGTTATATCTACATATCTGTCATCTAAATAAATGGCTTTCAAAC

AAAGCAACTGGGTCATTAAAACCAGCTCAAAGGGGGTTTAAAAAAAAAAAACCAGCCCATCCTTTGAGGCTGATTTT 

TCTTTTTTTTAAGTTCTATTTTAAAAGCTATCAAACAGCGACATAGCCATACATCTGACTGCCTGACATGGACTCCT 

GCCCACTTGGGGGAAACCTTATACCCAGAGGAAAATACACACCTGGGGAGTACATTTGACAAATTTCCCTTAGGATT 

TCGTTATCTCACCTTGACCCTCAGCCAAGATTGGXAAAGCTGCGTCCTGGCGATTCCAGGAGACCCAGCTGGAAACC 

TGGCTTCTCCATGTGAGGGGATGGGAAAGGAAAGAAGAGAATGAAGACTACTTAGTAATTCCCATCAGGAAATGCTG 

ACCTTTTACATAAAATCAAGGAGACTGCTGAAAATCTCTAAGGGACAGGATTTTCCAGATCCTAATTGGAAATTTAG 

CAATAAGGAGAGGAGTCCAAGGGGACAAATAAAGGCAGAGAGAAGAGACAGAACTAAAAATACGAGGAAAGGAGAGT 

GAGGATTTTCATTAAAAGTCTCAGCAGTGGGTTTCTTGGGTTATTTAAAACATCACCTAAATAGGCCTTTTCTTCCT 

AATTGGCCATCAAATTAAAGCCTATCCTTTCTCAAGCAGGAGCTGGTATTGTAGGGAGTGGCCGGGTATTCTGGGCT 

GGGCTCTTCTGGAGTAGGGGGTCAGCAAACATTGTCTGCAAAGGGCCAGATACTGAATCCAGTACTTTCAGTTTGGC 

GAGCCGTGAGGTCTCTGTCGAAACTACTCAACTCTGCCGTCCTAGCACAAAAGCAGCCATAGACAACACACAAACGA 

GAGGGCTTGGCTCCCTTCCAGGAAGATTTATTTAACAGGCTCCCAGCTGAAXTTCACTCACAGGACACAGTTTACTG 

ATCTCTGTTCTAGTGAGTGGGTCAAAAAGCATATGCATCCTTATCCGTCAACTCATCAGCTCTTCCTCAAGGCAACC 

TGAGGCCAGACACCAAGAAACCAAGCGTATCTGCTCTAAAATGACTTGTTCCTGGGGAATGCCTTCAACCAAAACAC 

AGCTAGTATTTCTATGCCCCAAATCCAATCCCAGTCTTTCATGATCCATGCCGGCGGTTGGGTGGGGAGGGGAATCA 

TTGGTTGGGGGAAGGGAGGAAACCCCACCTCCAGCCCCCGCCACCGGGCTCCCTGGGCACCCAGCAAGATCTGGGGC 

TGCAGAGAACAGAAGAGCTGGTGCACTTAATCCAGCTCTGCCCTTGGGGGGAGGAGGACCTGTGTGTCAGGCTCTGC 

CATGGGAACGAGTGTAAACCGTGGCTGTCTCCTGCAGTGAGCCACCGCGGCAGGCACGTTGACTGTTTTACTGACAT 

CACTCAAAAGCTAAAGCAATAACATTCTCCTGCGTTGCTGAGTCAGCTGTTCATTTGTCCGCCAGCTCCTGGACTGG 

ATGTGTGAAAGGCATCACATTTCCATTTTCCTCCGTGTAAATGTTTTATGTGTTCGCCTACTGATCCCATTCGTTGC 

TTCTATTGTAAATATTTGTCATTTGTATTTATTATCTCTGTGTTTTCCCCCTAAGGCATAAAATGGTTTACTGTGTT 

CATTTGAACCCATTTACTGATCTCTGTTGTATATTTTTCATGCCACTGCTTTGTTTTCTCCTCAGAAGTCGGGTAGA 

TAGCATTTCTATCCCATCCCTCACGTTATTGGAAGCATGCAACAGTATTTATTGCTCAGGGTCTTCTGCTTAAAACT 

GAGGAAGGTCCACATTCCTGCAAGCATTGATTGAGACATTTGCACAATCTAAAATGTAAGCAAAGTAGTCATTAAAA 

ATACACCCTCTACTTGGGCTTTATACTGCATACAAATTTACTCATGAGCCTTCCTTTGAGGAAGGATGTGGATCTCC 

AAATAAAGATTTAGTGTTTATTTTGAGCTCTGCATCTTAACAAGATGATCTGAACACCTCTCCTTTGTATCAATAAA 

TAGCCCTGTTATTCTGAAGTGAGAGGACCAAGTATAGTAAAATGCTGACATCTAAAACTAAATAAATAGAAAACACC 

AGGCCAGAACTATAGTCATACTCACACAAAGGGAGAAATTTAAACTCGAACCAAGCAAAAGGCTTCACGGAAATAGC 

ATGGAAA7VACAATGCTTCCAGTGGCCACTTCCTAAGGAGGAACAACCCCGTCTGATCTCAGAATTGGCACCACGTGA 

GCTTGCTAAGTGATAATATCTGTTTCTACTACGGATTTAGGCAACAGGACCTGTACATTGTCACATTGCATTATTTT 

TCTTCAAGCGTTAATAAAAGTTTTAAATAAATGGCT 



<210> SEQ ID NO 135 
<211> Length : 1,919 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 135 
>R38144_PEA_2_T6

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 

GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 

CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGGGAGCGAGTCAAGGCCATGTTCTACCACGCCTAC 

GACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGACCTCTCACCTGTGACGGGCACGACACCTGGGGCAG 

TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTG 

AAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGTGGTAGGA 

GGACTCCTGTCTGCTCATCTGCTCTCCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCCTCT 

CCTGAGAATGGCTGAGGAGGCGGCCCGAAAACTCCTCCCAGCCTTTCAGACCCCCACTGGCATGCCATATGGAACAG 

TGAACTTACTTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTTGAA 

TTTGCCACCCTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTGGGA 

GAGCCGGTCAGATATCGGGCTGGTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAGGCA 

TCGGGGCTGGCGTGGACTCCTACTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATGGCC 

ATGTTCCTAGAGTATAACAAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTACAA 

GGGGACTGTGTCCATGCCAGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAGAGCCTCATTGGAGACATTG 

ACAATGCCATGAGGACCTTCCTCAACTACTACACTGTATGGAAGCAGTTTGGGGGGCTCCCGGAATTCTACAACATT 

CCTCAGGGATACACAGTGGAGAAGCGAGAGGGCTACCCACTTCGGCCAGAACTTATTGAAAGCGCAATGTACCTCTA 

CCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGTGGAATCCATTGAAAAAATCAGCAAGGTGG 

AGTGCGGATTTGCAACACTTGCTTCCTTCTCCCACATGTCAGATCAAAGATCTGCGAGACCACAAGCTGGACAACCG 

CATGGAGTCGTTCTTCCTGGCCGAGACTGTGAAATACCTCTACCTCCTGTTTGACCCAACCAACTTCATCCACAACA 

ATGGGTCCACCTTCGACGCGGTGATCACCCCCTATGGGGAGTGCATCCTGGGGGCTGGGGGGTACATCTTCAACACA 

GAAGCTCACCCCATCGACCCTGCCGCCCTGCACTGCTGCCAGAGGCTGAAGGAAGAGCAGTGGGAGGTGGAGGACTT 

GATGAGGGAATTCTACTCTCTCAAACGGAGCAGGTCGAAATTTCAGAAAAACACTGTTAGTTCGGGGCCATGGGAAC 

CTCCAGCAAGGCCAGGAACACTCTTCTCACCAGAAAACCATGACCAGGCAAGGGAGAGGAAGCCTGCCAAACAGAAG 

GTCCCACTTCTCAGCTGCCCCAGTCAGCCCTTCACCTCCAAGTTGGCATTACTGGGACAGGTTTTCCTAGACTCCTC 

ATAACCACTGGATAATTTTTTTATTTTTATTTTTTTGAGGCTAAACTATAATAAATTGCTTTTGGCTATCA 

<210> SEQ ID NO 136 

<211> Length : 1,743 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 136
>R38144_PEA_2_T10

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 
GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 
CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGG 
TTGAAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGTGGTA 
GGAGGACTCCTGTCTGCTCATCTGCTCTCCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCC 
TCTCCTGAGAATGGCTGAGGAGGCGGCCCGAAAACTCCTCCCAGCCTTTCAGACCCCCACTGGCATGCCATATGGAA 
CAGTGAACTTACTTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTT 
GAATTTGCCACCCTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTG 
GGAGAGCCGGTCAGATATCGGGCTGGTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAG 
GCATCGGGGCTGGCGTGGACTCCTACTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATG 
GCCATGTTCCTAGAGTATAACAAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTA 
CAAGGGGACTGTGTCCATGCCAGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAGAGCCTCATTGGAGACA 
TTGACAATGCCATGAGGACCTTCCTCAACTACTACACTGTATGGAAGCAGTTTGGGGGGCTCCCGGAATTCTACAAC 
ATTCCTCAGGGATACACAGTGGAGAAGCGAGAGGGCTACCCACTTCGGCCAGAACTTATTGAAAGCGCAATGTACCT 
CTACCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGTGGAATCCATTGAAAAAATCAGCAAGG 
TGGAGTGCGGATTTGCAACAATCAAAGATCTGCGAGACCACAAGCTGGACAACCGCATGGAGTCGTTCTTCCTGGCC 
GAGACTGTGAAATACCTCTACCTCCTGTTTGACCCAACCAACTTCATCCACAACAATGGGTCCACCTTCGACGCGGT 
GATCACCCCCTATGGGGAGTGCATCCTGGGGGCTGGGGGGTACATCTTCAACACAGAAGCTCACCCCATCGACCCTG 
CCGCCCTGCACTGCTGCCAGAGGCTGAAGGAAGAGCAGTGGGAGGTGGAGGACTTGATGAGGGAATTCTACTCTCTC 
AAACGGAGCAGGTCGAAATTTCAGAAAAACACTGTTAGTTCGGGGCCATGGGAACCTCCAGCAAGGCCAGGAACACT 
CTTCTCACCAGAAAACCATGACCAGGCAAGGGAGAGGAAGCCTGCCAAACAGAAGGTCCCACTTCTCAGCTGCCCCA 
GTCAGCCCTTCACCTCCAAGTTGGCATTACTGGGACAGGTTTTCCTAGACTCCTCATAACCACTGGATAATTTTTTT 
ATTTTTATTTTTTTGAGGCTAAACTATAATAAATTGCTTTTGGCTATCA 

<210> SEQ ID NO 137 

<211> Length : 1,749 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 137 
>R38144_PEA_2_T13

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 
GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 
CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGGGAGCGAGTCAAGGCCATGTTCTACCACGCCTAC 
GACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGACCTCTCACCTGTGACGGGCACGACACCTGGGGCAG
TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTG 
AAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGTGGTAGGA 
GGACTCCTGTCTGCTCATCTGCTCTCCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCCTCT 
CCTGAGAATGGCTGAGGAGGCGGCCCGAAAACTCCTCCCAGCCTTTCAGACCCCCACTGGCATGCCATATGGAACAG 
TGAACTTACTTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTTGAA 
TTTGCCACCCTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTGGGA 
GAGCCGGTCAGATATCGGGCTGGTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAGGCA 
TCGGGGCTGGCGTGGACTCCTACTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATGGCC 
ATGTTCCTAGAGTATAACAAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTACAA 
GGGGACTGTGTCCATGCCAGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAGAACTTATTGAAAGCGCAAT 
GTACCTCTACCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGTGGAATCCATTGAAAAAATCA 
GCAAGGTGGAGTGCGGATTTGCAACAATCAAAGATCTGCGAGACCACAAGCTGGACAACCGCATGGAGTCGTTCTTC 
CTGGCCGAGACTGTGAAATACCTCTACCTCCTGTTTGACCCAACCAACTTCATCCACAACAATGGGTCCACCTTCGA 
CGCGGTGATCACCCCCTATGGGGAGTGCATCCTGGGGGCTGGGGGGTACATCTTCAACACAGAAGCTCACCCCATCG 
ACCCTGCCGCCCTGCACTGCTGCCAGAGGCTGAAGGAAGAGCAGTGGGAGGTGGAGGACTTGATGAGGGAATTCTAC 
TCTCTCAAACGGAGCAGGTCGAAATTTCAGAAAAACACTGTTAGTTCGGGGCCATGGGAACCTCCAGCAAGGCCAGG 
AACACTCTTCTCACCAGAAAACCATGACCAGGCAAGGGAGAGGAAGCCTGCCAAACAGAAGGTCCCACTTCTCAGCT 
GCCCCAGTCAGCCCTTCACCTCCAAGTTGGCATTACTGGGACAGGTTTTCCTAGACTCCTCATAACCACTGGATAAT 
TTTTTTATTTTTATTTTTTTGAGGCTAAACTATAATAAATTGCTTTTGGCTATCA 

<210> SEQ ID NO 138 

<211> Length : 1,769 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 138 
>R38144_PEA_2_T15

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 
GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 
CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGGGAGCGAGTCAAGGCCATGTTCTACCACGCCTAC 
GACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGACCTCTCACCTGTGACGGGCACGACACCTGGGGCAG 
TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTG 
AAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGTGGTAGGA 
GGACTCCTGTCTGCTCATCTGCTCTCCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCCTCT 
CCTGAGAATGGCTGAGGAGGCGGCCCGAAAACTCCTCCCAGCCTTTCAGACCCCCACTGGCATGCCATATGGAACAG 
TGAACTTACTTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTTGAA

TTTGCCACCCTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTGGGA 

GAGCCGGTCAGATATCGGGCTGGTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAGGCA 

TCGGGGCTGGCGTGGACTCCTACTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATGGCC 

ATGTTCCTAGAGCCTCATTGGAGACATTGACAATGCCATGAGGACCTTCCTCAACTACTACACTGTATGGAAGCAGT 

TTGGGGGGCTCCCGGAATTCTACAACATTCCTCAGGGATACACAGTGGAGAAGCGAGAGGGCTACCCACTTCGGCCA 

GAACTTATTGAAAGCGCAATGTACCTCTACCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGT 

GGAATCCATTGAAAAAATCAGCAAGGTGGAGTGCGGATTTGCAACAATCAAAGATCTGCGAGACCACAAGCTGGACA 

ACCGCATGGAGTCGTTCTTCCTGGCCGAGACTGTGAAATACCTCTACCTCCTGTTTGACCCAACCAACTTCATCCAC 

AACAATGGGTCCACCTTCGACGCGGTGATCACCCCCTATGGGGAGTGCATCCTGGGGGCTGGGGGGTACATCTTCAA 

CACAGAAGCTCACCCCATCGACCCTGCCGCCCTGCACTGCTGCCAGAGGCTGAAGGAAGAGCAGTGGGAGGTGGAGG 

ACTTGATGAGGGAATTCTACTCTCTCAAACGGAGCAGGTCGAAATTTCAGAAAAACACTGTTAGTTCGGGGCCATGG 

GAACCTCCAGCAAGGCCAGGAACACTCTTCTCACCAGAAAACCATGACCAGGCAAGGGAGAGGAAGCCTGCCAAACA 

GAAGGTCCCACTTCTCAGCTGCCCCAGTCAGCCCTTCACCTCCAAGTTGGCATTACTGGGACAGGTTTTCCTAGACT 

CCTCATAACCACTGGATAATTTTTTTATTTTTATTTTTTTGAGGCTAAACTATAATAAATTGCTTTTGGCTATCA 

<210> SEQ ID NO 139

<211> Length : 1,522

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 139 
>R38144_PEA_2_T19

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 
GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 
CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGGGAGCGAGTCAAGGCCATGTTCTACCACGCCTAC 
GACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGACCTCTCACCTGTGACGGGCACGACACCTGGGGCAG 
TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTG 
AAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGTGGTAGGA 
GGACTCCTGTCTGCTCATCTGCTCTCCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCCTCT 
CCTGAGAATGGCTGAGGAGGCGGCCCGAAAACTCCTCCCAGCCTTTCAGACCCCCACTGGCATGCCATATGGAACAG 
TGAACTTACTTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTTGAA 
TTTGCCACCCTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTGGGA 
GAGCCGGTCAGATATCGGGCTGGTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAGGCA 
TCGGGGCTGGCGTGGACTCCTACTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATGGCC 
ATGTTCCTAGAGTATAACAAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTACAA 
GGGGACTGTGTCCATGCCAGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAGAGCCTCATTGGAGACATTG
ACAATGCCATGAGGACCTTCCTCAACTACTACACTGTATGGAAGCAGTTTGGGGGGCTCCCGGAATTCTACAACATT 
CCTCAGGGATACACAGTGGAGAAGCGAGAGGGCTACCCACTTCGGCCAGAACTTATTGAAAGCGCAATGTACCTCTA 
CCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGTGGAATCCATTGAAAAAATCAGCAAGGTGG 
AGTGCGGATTTGCAACAAAAAGATCTCGCTCTGTTGCCCAGGCTGGAGTGCAGTGGTGTGATCACGACTCACCGCAG 
CCTTGACCTCCCACACTCAAGCAATCCTCCTGCCTTAGCCTTCCAAGTAGCTGGAACTCCAGGTGGTGGTTAATTTT 
ATGTGTCAACCTGGCTGGACCACTGGGTACTCCGATATTTGGTCAAACATTATTCTGAG 

<210> SEQ ID NO 140 

<211> Length : 1,414 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 140 
>R38144_PEA_2_T27

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA 
GGACACGAGCTCTATGCCTTTCCGGCTGCTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTG 
CGCCAGGTCCCGACGGCTCCGCGCCAGATCCCGCCCACTACAGGGAGCGAGTCAAGGCCATGTTCTACCACGCCTAC 
GACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGACCTCTCACCTGTGACGGGCACGACACCTGGGGCAG 
TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTGATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTG 
AAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAACGCCTCTGTGTTTGAAACAAACATTCGAGAGTATAAC 
AAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTACAAGGGGACTGTGTCCATGCC 
AGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAGAGCCTCATTGGAGACATTGACAATGCCATGAGGACCT 
TCCTCAACTACTACACTGTATGGAAGCAGTTTGGGGGGCTCCCGGAATTCTACAACATTCCTCAGGGATACACAGTG 
GAGAAGCGAGAGGGCTACCCACTTCGGCCAGAACTTATTGAAAGCGCAATGTACCTCTACCGTGCCACGGGGGATCC 
CACCCTCCTAGAACTCGGAAGAGATGCTGTGGAATCCATTGAAAAAATCAGCAAGGTGGAGTGCGGATTTGCAACAA 
TCAAAGATCTGCGAGACCACAAGCTGGACAACCGCATGGAGTCGTTCTTCCTGGCCGAGACTGTGAAATACCTCTAC 
CTCCTGTTTGACCCAACCAACTTCATCCACAACAATGGGTCCACCTTCGACGCGGTGATCACCCCCTATGGGGAGTG 
CATCCTGGGGGCTGGGGGGTACATCTTCAACACAGAAGCTCACCCCATCGACCCTGCCGCCCTGCACTGCTGCCAGA 
GGCTGAAGGAAGAGCAGTGGGAGGTGGAGGACTTGATGAGGGAATTCTACTCTCTCAAACGGAGCAGGTCGAAATTT 
CAGAAAAACACTGTTAGTTCGGGGCCATGGGAACCTCCAGCAAGGCCAGGAACACTCTTCTCACCAGAAAACCATGA 
CCAGGCAAGGGAGAGGAAGCCTGCCAAACAGAAGGTCCCACTTCTCAGCTGCCCCAGTCAGCCCTTCACCTCCAAGT 
TGGCATTACTGGGACAGGTTTTCCTAGACTCCTCATAACCACTGGATAATTTTTTTATTTTTATTTTTTTGAGGCTA 
AACTATAATAAATTGCTTTTGGCTATCA 
<210> SEQ ID NO 141

<211> Length : 1,846 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 141 
>HUMOSTRO_PEA_1_PEA_1_T14

GTGGCAGAAAACCTCATGACACAATCTCTCCGCCTCCCTGTGTTGGTGGAGGATGTCTGCAGCAGCATTTAAATTCT 

GGGAGGGCTTGGTTGTCAGCAGCAGCAGGAGGAGGCAGAGCACAGCATCGTCGGGACCAGACTCGTCTCAGGCCAGT 

TGCAGCCTTCTCAGCCAAACGCCGACCAAGGAAAACTCACTACCATGAGAATTGCAGTGATTTGCTTTTGCCTCCTA 

GGCATCACCTGTGCCATACCAGTTAAACAGGCTGATTCTGGAAGTTCTGAGGAAAAGCAGCTTTACAACAAATACCC 

AGATGCTGTGGCCACATGGCTAAACCCTGACCCATCTCAGAAGCAGAATCTCCTAGCCCCACAGGTATTTTTAAACT 

TCTCATAATTAAACTACAGTGATGAAAGATAGCCACACTCAGGCCATTTGGGCTGCTCAGATGAATCCTGCCTGCCT 

GCTGGCAAACATGTGCTTAGGACATTGACTGATCTGCCATGTTGGCTTCTCTCTGTGTTAAGCCATCCACAGATGAG 

GCTGAAAAATAAAAACTGCTTTGGATTAAAAAGGTTAACTTTTGAATAAAAAAGCTAGGCATGTGTGATGCGCACTA 

ACACGTGCCATTCCTTCTTCAGAATGCTGTGTCCTCTGAAGAAACCAATGACTTTAAACAAGAGACCCTTCCAAGTA 

AGTCCAACGAAAGCCATGACCACATGGATGATATGGATGATGAAGATGATGATGACCATGTGGACAGCCAGGACTCC 

ATTGACTCGAACGACTCTGATGATGTAGATGACACTGATGATTCTCACCAGTCTGATGAGTCTCACCATTCTGATGA 

ATCTGATGAACTGGTCACTGATTTTCCCACGGACCTGCCAGCAACCGAAGTTTTCACTCCAGTTGTCCCCACAGTAG 

ACACATATGATGGCCGAGGTGATAGTGTGGTTTATGGACTGAGGTCAAAATCTAAGAAGTTTCGCAGACCTGACATC 

CAGTACCCTGATGCTACAGACGAGGACATCACCTCACACATGGAAAGCGAGGAGTTGAATGGTGCATACAAGGCCAT 

CCCCGTTGCCCAGGACCTGAACGCGCCTTCTGATTGGGACAGCCGTGGGAAGGACAGTTATGAAACGAGTCAGCTGG 

ATGACCAGAGTGCTGAAACCCACAGCCACAAGCAGTCCAGATTATATAAGCGGAAAGCCAATGATGAGAGCAATGAG 

CATTCCGATGTGATTGATAGTCAGGAACTTTCCAAAGTCAGCCGTGAATTCCACAGCCATGAATTTCACAGCCATGA 

AGATATGCTGGTTGTAGACCCCAAAAGTAAGGAAGAAGATAAACACCTGAAATTTCGTATTTCTCATGAATTAGATA 

GTGCATCTTCTGAGGTCAATTAAAAGGAGAAAAAATACAATTTCTCACTTTGCATTTAGTCAAAAGAAAAAATGCTT 

TATAGCAAAATGAAAGAGAACATGAAATGCTTCTTTCTCAGTTTATTGGTTGAATGTGTATCTATTTGAGTCTGGAA 

ATAACTAATGTGTTTGATAATTAGTTTAGTTTGTGGCTTCATGGAAACTCCCTGTAAACTAAAAGCTTCAGGGTTAT 

GTCTATGTTCATTCTATAGAAGAAATGCAAACTATCACTGTATTTTAATATTTGTTATTCTCTCATGAATAGAAATT 

TATGTAGAAGCAAACAAAATACTTTTACCCACTTAAAAAGAGAATATAACATTTTATGTCACTATAATCTTTTGTTT 

TTTAAGTTAGTGTATATTTTGTTGTGATTATCTTTTTGTGGTGTGAATAAATCTTTTATCTTGAATGTAATAAGA 

<210> SEQ ID NO 142 
<211> Length : 1,769
<212> Type : DNA
<213> Organism : Homo sapiens



<400> sequence : 142 
>HUMOSTRO_PEA_1_PEA_1_T16

GTGGCAGAAAACCTCATGACACAATCTCTCCGCCTCCCTGTGTTGGTGGAGGATGTCTGCAGCAGCATTTAAATTCT 

GGGAGGGCTTGGTTGTCAGCAGCAGCAGGAGGAGGCAGAGCACAGCATCGTCGGGACCAGACTCGTCTCAGGCCAGT 

TGCAGCCTTCTCAGCCAAACGCCGACCAAGGAAAACTCACTACCATGAGAATTGCAGTGATTTGCTTTTGCCTCCTA 

GGCATCACCTGTGCCATACCAGTTAAACAGGCTGATTCTGGAAGTTCTGAGGAAAAGCAGCACTAAAGATGTACCTA 

CCCCTCCACAACAGATGAAACTGTGCCAGCCAAACAACAAATGGGCATTGTCCCCAGAAGCTTGGACAAAAAGGCAC 

ACAGAGTTCAATTCCAGTTGAACAGAATAAAGGCCAAAATAGAGCTGCCTTGGGGGTCACTGCAATTAGACTGCTTA 

ATGAAGACATTAAAAGAACTTTACAACAAATACCCAGATGCTGTGGCCACATGGCTAAACCCTGACCCATCTCAGAA 

GCAGAATCTCCTAGCCCCACAGAATGCTGTGTCCTCTGAAGAAACCAATGACTTTAAACAAGAGACCCTTCCAAGTA 

AGTCCAACGAAAGCCATGACCACATGGATGATATGGATGATGAAGATGATGATGACCATGTGGACAGCCAGGACTCC 

ATTGACTCGAACGACTCTGATGATGTAGATGACACTGATGATTCTCACCAGTCTGATGAGTCTCACCATTCTGATGA 

ATCTGATGAACTGGTCACTGATTTTCCCACGGACCTGCCAGCAACCGAAGTTTTCACTCCAGTTGTCCCCACAGTAG 

ACACATATGATGGCCGAGGTGATAGTGTGGTTTATGGACTGAGGTCAAAATCTAAGAAGTTTCGCAGACCTGACATC 

CAGTACCCTGATGCTACAGACGAGGACATCACCTCACACATGGAAAGCGAGGAGTTGAATGGTGCATACAAGGCCAT 

CCCCGTTGCCCAGGACCTGAACGCGCCTTCTGATTGGGACAGCCGTGGGAAGGACAGTTATGAAACGAGTCAGCTGG 

ATGACCAGAGTGCTGAAACCCACAGCCACAAGCAGTCCAGATTATATAAGCGGAAAGCCAATGATGAGAGCAATGAG 

CATTCCGATGTGATTGATAGTCAGGAACTTTCCAAAGTCAGCCGTGAATTCCACAGCCATGAATTTCACAGCCATGA 

AGATATGCTGGTTGTAGACCCCAAAAGTAAGGAAGAAGATAAACACCTGAAATTTCGTATTTCTCATGAATTAGATA 

GTGCATCTTCTGAGGTCAATTAAAAGGAGAAAAAATACAATTTCTCACTTTGCATTTAGTCAAAAGAAAAAATGCTT 

TATAGCAAAATGAAAGAGAACATGAAATGCTTCTTTCTCAGTTTATTGGTTGAATGTGTATCTATTTGAGTCTGGAA 

ATAACTAATGTGTTTGATAATTAGTTTAGTTTGTGGCTTCATGGAAACTCCCTGTAAACTAAAAGCTTCAGGGTTAT 

GTCTATGTTCATTCTATAGAAGAAATGCAAACTATCACTGTATTTTAATATTTGTTATTCTCTCATGAATAGAAATT 

TATGTAGAAGCAAACAAAATACTTTTACCCACTTAAAAAGAGAATATAACATTTTATGTCACTATAATCTTTTGTTT 

TTTAAGTTAGTGTATATTTTGTTGTGATTATCTTTTTGTGGTGTGAATAAATCTTTTATCTTGAATGTAATAAGA 

<210> SEQ ID NO 143 

<211> Length : 378 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 143 
>HUMOSTRO_PEA_1_PEA_1_T30
GTGGCAGAAAACCTCATGACACAATCTCTCCGCCTCCCTGTGTTGGTGGAGGATGTCTGCAGCAGCATTTAAATTCT
GGGAGGGCTTGGTTGTCAGCAGCAGCAGGAGGAGGCAGAGCACAGCATCGTCGGGACCAGACTCGTCTCAGGCCAGT 
TGCAGCCTTCTCAGCCAAACGCCGACCAAGGAAAACTCACTACCATGAGAATTGCAGTGATTTGCTTTTGCCTCCTA 
GGCATCACCTGTGCCATACCAGTTAAACAGGCTGATTCTGGAAGTTCTGAGGAAAAGCAGGTAAGCATCTTTTATGT 
TTTTATATAGTTAAATCATTTACTCAATTATGGCGAGAGGTGCAAGAAACGTATTTGCTGCGATATTACT 



<210> SEQ ID NO 144 

<211> Length : 1,295 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 144 
>R11723_PEA_1_T15

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 

TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 

GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 

CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 

TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 

GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT 

TTGCGGATTGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACG 

ACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGATGGAGCAA 

AGTGCCGGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGGCCTGTCTCATCGCCTCTGCCGGTTCTCCTTGCAG 

AGGACTGGCGCCGGGACGCGAAGAGCAACGGGCGCTGCACAAAGCGGGCGCTGTCGGTGGTGGAGTGCGCATGTACG 

CGCAGGCGCTTCTCGTGGTTGGCGTGCTGCAGCGACAGGCGGCAGCACAGCACCTGCACGAACACCCGCCGAAACTG 

CTGCGAGGACACCGTGTACAGGAGCGGGTTGATGACCGAGCTGAGGTAGAAAAACGTCTCCGAGAAGGGGAGGAGGA 

TCATGTACGCCCGGAAGTAGGACCTCGTCCAGTCGTGCTTGGGTTTGGCCGCAGCCATGATCCTCCGAATCTGGTTG 

GGCATCCAGCATACGGCCAATGTCACAACAATCAGCCCTGGGCAGACACGAGCAGGAGGGAGAGACAGAGAAAAGAA 

AAACACAGCATGAGAACACAGTAAATAAATAAAACCATAAAATATTTAGCCCCTCTGTTCTGTGCTTACTGGCCAGG 

AAATGGTACCAATTTTTCAGTGTTGGACTTGACAGCTTCTTTTGCCACAAGCAAGAGAGAATTTAACACTGTTTCAA 

ACCCGGGGGAGTTGGCTGTGTTAAAGAAAGACCATTAAATGCTTTAGACAGTGTATTTATACC 



<210> SEQ ID NO 145 
<211> Length : 1,367 
<212> Type : DNA 
<213> Organism : Homo sapiens



<400> sequence : 145 
>R11723_PEA_1_T17

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 

TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 

GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 

CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 

TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 

GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT 

TTGCGGATTGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACG 

ACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGATGGAGCAA 

AGTGCCGGGTCTCACTGTGTCACGAGGCTGGAGTGCAGTGGAACAATTTCAGCACACTGCAACCTCTGCCTCCCAGG 

CTCAAATGATCATCCCACCTAAGCCTCCGGAGTAGCTGGGACCACAGGCAAGCGCCACCATGCCCAGCTGATACCAA 

TGTCTTTTAAAAAATGTTGTATGTGGAAATAAATTGAGACTTATAGAAAAGCTGCAAAAATAGTGCAGTTTCTATAT 

ATCCTTCCCCCATCTTTGGCTAGTGTTAACAATCTACATAACCGCAGTACGATGATCAAGGCTAGGAAATTAACATT 

GGCACAGTACTGTTAATGAAACCATGCTTTGTTTTGAGATTCCCACAGTTTTGCCTTTTTCTGTTCCAAGATCCTAT 

CCAGGATCCCACGTTGCATTTCATTGTCATGTCTCCTTCTCCTCTAACCTCTGACAATGCATCATTCTTTCCATGTC 

TTTTGTGATGTTGACACTTTTGAAGAGGACTGGTCCAGATTTTTGTACACTGTCCCTCAGTTTGGGATTGTCTGCTG 

TTTTCTCATGAACAGATAGAGGTTTTGCATTTTTGACAAGAATCCTCAGAAGAGATGCACCCTTCTCAGTGCACTGT 

AGCAAGGGGCGCATGCTGTCAATGTCTTACTGGTGATGTTAACTTTGATCGCTTTTGATTCAGATAGTATCTGCTGG 

GTTTTTCCACTGTAAAGTTACTATTTTTTCCATTGTAATTAATAAATAACTTGAGGGA 

<210> SEQ ID NO 146 

<211> Length : 1,520 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 146 
>R11723_PEA_1_T19

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 
TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 
GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 
CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 
TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 
GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT
TTGCGGATTGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACG 
ACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGATGGAGCAA 
AGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCAGTTGACCACTTAGTTCTC 
AAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGAAGTATTCAGAGGGGGAATATCAAATGTCTTTCCCTTGGACT 
CTCCCAGGTCTCACTGTGTCACGAGGCTGGAGTGCAGTGGAACAATTTCAGCACACTGCAACCTCTGCCTCCCAGGC 
TCAAATGATCATCCCACCTAAGCCTCCGGAGTAGCTGGGACCACAGGCAAGCGCCACCATGCCCAGCTGATACCAAT 
GTCTTTTAAAAAATGTTGTATGTGGAAATAAATTGAGACTTATAGAAAAGCTGCAAAAATAGTGCAGTTTCTATATA 
TCCTTCCCCCATCTTTGGCTAGTGTTAACAATCTACATAACCGCAGTACGATGATCAAGGCTAGGAAATTAACATTG 
GCACAGTACTGTTAATGAAACCATGCTTTGTTTTGAGATTCCCACAGTTTTGCCTTTTTCTGTTCCAAGATCCTATC 
CAGGATCCCACGTTGCATTTCATTGTCATGTCTCCTTCTCCTCTAACCTCTGACAATGCATCATTCTTTCCATGTCT 
TTTGTGATGTTGACACTTTTGAAGAGGACTGGTCCAGATTTTTGTACACTGTCCCTCAGTTTGGGATTGTCTGCTGT 
TTTCTCATGAACAGATAGAGGTTTTGCATTTTTGACAAGAATCCTCAGAAGAGATGCACCCTTCTCAGTGCACTGTA 
GCAAGGGGCGCATGCTGTCAATGTCTTACTGGTGATGTTAACTTTGATCGCTTTTGATTCAGATAGTATCTGCTGGG 
TTTTTCCACTGTAAAGTTACTATTTTTTCCATTGTAATTAATAAATAACTTGAGGGA 

<210> SEQ ID NO 147 

<211> Length : 1,371

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 147 
>R11723_PEA_1_T20

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 
TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 
GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 
CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 
TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 
GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT 
TTGCGGATTGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACG 
ACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGATGGAGCAA 
AGTGCCGACAGGGTCTCACTGTGTCACGAGGCTGGAGTGCAGTGGAACAATTTCAGCACACTGCAACCTCTGCCTCC 
CAGGCTCAAATGATCATCCCACCTAAGCCTCCGGAGTAGCTGGGACCACAGGCAAGCGCCACCATGCCCAGCTGATA 
CCAATGTCTTTTAAAAAATGTTGTATGTGGAAATAAATTGAGACTTATAGAAAAGCTGCAAAAATAGTGCAGTTTCT 
ATATATCCTTCCCCCATCTTTGGCTAGTGTTAACAATCTACATAACCGCAGTACGATGATCAAGGCTAGGAAATTAA 
CATTGGCACAGTACTGTTAATGAAACCATGCTTTGTTTTGAGATTCCCACAGTTTTGCCTTTTTCTGTTCCAAGATC 
CTATCCAGGATCCCACGTTGCATTTCATTGTCATGTCTCCTTCTCCTCTAACCTCTGACAATGCATCATTCTTTCCA
TGTCTTTTGTGATGTTGACACTTTTGAAGAGGACTGGTCCAGATTTTTGTACACTGTCCCTCAGTTTGGGATTGTCT 
GCTGTTTTCTCATGAACAGATAGAGGTTTTGCATTTTTGACAAGAATCCTCAGAAGAGATGCACCCTTCTCAGTGCA 
CTGTAGCAAGGGGCGCATGCTGTCAATGTCTTACTGGTGATGTTAACTTTGATCGCTTTTGATTCAGATAGTATCTG 
CTGGGTTTTTCCACTGTAAAGTTACTATTTTTTCCATTGTAATTAATAAATAACTTGAGGGA 

<210> SEQ ID NO 148 

<211> Length : 2,213

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 148 
>R11723_PEA_1_T5

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 
TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 
GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 
CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 
TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 
GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT 
TTGCGGATTGTTCTTGCTTCCAGGCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAGCTGAACAACG 
ACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGTGATGGAGCAA 
AGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCAGTTGACCACTTAGTTCTC 
AAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGAAGTATTCAGAGGGGGAATATCAAATGTCTTTCCCTTGGACT 
CTCCCAGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGGCCTGTCTCATCGCCTCTGCCGGGTACCAGTCCTTC 
TGCTCCCCAGGGAAACTGAACTCAGTTTGCATCAGCTGCTGCAACACCCCTCTTTGTAACGGGCCAAGGCCCAAGAA 
AAGGGGAAGTTCTGCCTCGGCCCTCAGGCCAGGGCTCCGCACCACCATCCTGTTCCTCAAATTAGCCCTCTTCTCGG 
CACACTGCTGAAGCTGAAGGAGATGCCACCCCCTCCTGCATTGTTCTTCCAGCCCTCGCCCCCAACCCCCCACCTCC 
CTGAGTGAGTTTCTTCTGGGTGTCCTTTTATTCTGGGTAGGGAGCGGGAGTCCGTGTTCTCTTTTGTTCCTGTGCAA 
ATAATGAAAGAGCTCGGTAAAGCATTCTGAATAAATTCAGCCTGACTGAATTTTCAGTATGTACTTGAAGGAAGGAG 
GTGGAGTGAAAGTTCACCCCCATGTCTGTGTAACCGGAGTCAAGGCCAGGCTGGCAGAGTCAGTCCTTAGAAGTCAC 
TGAGGTGGGCATCTGCCTTTTGTAAAGCCTCCAGTGTCCATTCCATCCCTGATGGGGGCATAGTTTGAGACTGCAGA 
GTGAGAGTGACGTTTTCTTAGGGCTGGAGGGCCAGTTCCCACTCAAGGCTCCCTCGCTTGACATTCAAACTTCATGC 
TCCTGAAAACCATTCTCTGCAGCAGAATTGGCTGGTTTCGCGCCTGAGTTGGGCTCTAGTGACTCGAGACTCAATGA 
CTGGGACTTAGACTGGGGCTCGGCCTCGCTCTGAAAAGTGCTTAAGAAAATCTTCTCAGTTCTCCTTGCAGAGGACT 
GGCGCCGGGACGCGAAGAGCAACGGGCGCTGCACAAAGCGGGCGCTGTCGGTGGTGGAGTGCGCATGTACGCGCAGG 
CGCTTCTCGTGGTTGGCGTGCTGCAGCGACAGGCGGCAGCACAGCACCTGCACGAACACCCGCCGAAACTGCTGCGA
GGACACCGTGTACAGGAGCGGGTTGATGACCGAGCTGAGGTAGAAAAACGTCTCCGAGAAGGGGAGGAGGATCATGT 
ACGCCCGGAAGTAGGACCTCGTCCAGTCGTGCTTGGGTTTGGCCGCAGCCATGATCCTCCGAATCTGGTTGGGCATC 
CAGCATACGGCCAATGTCACAACAATCAGCCCTGGGCAGACACGAGCAGGAGGGAGAGACAGAGAAAAGAAAAACAC 
AGCATGAGAACACAGTAAATAAATAAAACCATAAAATATTTAGCCCCTCTGTTCTGTGCTTACTGGCCAGGAAATGG 
TACCAATTTTTCAGTGTTGGACTTGACAGCTTCTTTTGCCACAAGCAAGAGAGAATTTAACACTGTTTCAAACCCGG 
GGGAGTTGGCTGTGTTAAAGAAAGACCATTAAATGCTTTAGACAGTGTATTTATACC 

<210> SEQ ID NO 149 

<211> Length : 2,247 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 149 
>R11723_PEA_1_T6

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 

TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 

GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 

CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 

TCTCCGCGGCAGCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAGCCTCCGCTGCTGTC 

GCCTCCTCTGATGCGCTTGCCCTCTCCCGGCCCCGGGACTCCGGGAGAATGTGGGTCCTAGGCATCGCGGCAACTTT 

TTGCGGATTGTTCTTGCTTCCAGGTGAGAATACCCAGAGGCCAGCAGCCGAGGCCAGGCTTTGCGCTGCAAATCCAG 

TGCTACCAGTGTGAAGAATTCCAGCTGAACAACGACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGT 

TCAAGACATGTGTCAGAAAGAAGTGATGGAGCAAAGTGCCGACACTAAAAGAACAAACACCTTGCTCTTCGAGATGA 

GACATTTTGCCAAGCAGTTGACCACTTAGTTCTCAAGAAGCAACTATCTCTTTCATGTGCCTTCTGAGGAAGTATTC 

AGAGGGGGAATATCAAATGTCTTTCCCTTGGACTCTCCCAGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGGC 

CTGTCTCATCGCCTCTGCCGGGTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACTCAGTTTGCATCAGCTGCTGCA 

ACACCCCTCTTTGTAACGGGCCAAGGCCCAAGAAAAGGGGAAGTTCTGCCTCGGCCCTCAGGCCAGGGCTCCGCACC 

ACCATCCTGTTCCTCAAATTAGCCCTCTTCTCGGCACACTGCTGAAGCTGAAGGAGATGCCACCCCCTCCTGCATTG 

TTCTTCCAGCCCTCGCCCCCAACCCCCCACCTCCCTGAGTGAGTTTCTTCTGGGTGTCCTTTTATTCTGGGTAGGGA 

GCGGGAGTCCGTGTTCTCTTTTGTTCCTGTGCAAATAATGAAAGAGCTCGGTAAAGCATTCTGAATAAATTCAGCCT 

GACTGAATTTTCAGTATGTACTTGAAGGAAGGAGGTGGAGTGAAAGTTCACCCCCATGTCTGTGTAACCGGAGTCAA 

GGCCAGGCTGGCAGAGTCAGTCCTTAGAAGTCACTGAGGTGGGCATCTGCCTTTTGTAAAGCCTCCAGTGTCCATTC 

CATCCCTGATGGGGGCATAGTTTGAGACTGCAGAGTGAGAGTGACGTTTTCTTAGGGCTGGAGGGCCAGTTCCCACT 

CAAGGCTCCCTCGCTTGACATTCAAACTTCATGCTCCTGAAAACCATTCTCTGCAGCAGAATTGGCTGGTTTCGCGC 

CTGAGTTGGGCTCTAGTGACTCGAGACTCAATGACTGGGACTTAGACTGGGGCTCGGCCTCGCTCTGAAAAGTGCTT 

AAGAAAATCTTCTCAGTTCTCCTTGCAGAGGACTGGCGCCGGGACGCGAAGAGCAACGGGCGCTGCACAAAGCGGGC

GCTGTCGGTGGTGGAGTGCGCATGTACGCGCAGGCGCTTCTCGTGGTTGGCGTGCTGCAGCGACAGGCGGCAGCACA 

GCACCTGCACGAACACCCGCCGAAACTGCTGCGAGGACACCGTGTACAGGAGCGGGTTGATGACCGAGCTGAGGTAG 

AAAAACGTCTCCGAGAAGGGGAGGAGGATCATGTACGCCCGGAAGTAGGACCTCGTCCAGTCGTGCTTGGGTTTGGC 

CGCAGCCATGATCCTCCGAATCTGGTTGGGCATCCAGCATACGGCCAATGTCACAACAATCAGCCCTGGGCAGACAC 

GAGCAGGAGGGAGAGACAGAGAAAAGAAAAACACAGCATGAGAACACAGTAAATAAATAAAACCATAAAATATTTAG 

CCCCTCTGTTCTGTGCTTACTGGCCAGGAAATGGTACCAATTTTTCAGTGTTGGACTTGACAGCTTCTTTTGCCACA 

AGCAAGAGAGAATTTAACACTGTTTCAAACCCGGGGGAGTTGGCTGTGTTAAAGAAAGACCATTAAATGCTTTAGAC 
AGTGTATTTATACC 



<210> SEQ ID NO 150 

<211> Length : 876 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 150 
>R16276_PEA_1_T6

GTGGAGGAGGATGGTGGGGAGTGGTGGTGTCTTCGTCCTGGGAGAAGGCGAAGCAACTTCCAGGAGGAAACGGGCGT 

TTCCTTCCCACGCGCTCGAGCGAGCCCTGGGTCCTGGCCTCGGAACTCCACCCAGCCCCTCCCCACCCTCTGGGAAA 

AGCCAGTCGCCACACACAGGCACACGCAGGCCCCGGCGCCGCGCCCTAAGGAGAGCAGCACCCACAGCCAATTGCCA 

TGGCAACCCCGGGGTTCGTTCCACTTCCCCACCCAGCCGATCTCCCCCCTCCTCCCTGCACTGCAGCCAACCGGCTT 

GTGCGCGTCCCAGGAGCGCGCTATAAAACCTGTGCTGGGCGTGATCGGCAAGCACCGGACCAGGGGGAAGGCGAGCA 

GTGCCAATCTACAGCGAAGAAAGTCTCGTTTGGTAAAAGCGAGAAGGGAAAGCCTGAGCATGCAGAGTGTGCAGAGC 

ACGAGCTTTTGTCTCCGAAAGCAGTGCCTTTGCCTGACCTTCCTGCTTCTCCATCTCCTGGGACAGGTCGCTGCGAC 

TCAGCGCTGCCCTCCCCAGTGCCCGGGCCAGTGCCCTGCGACGCCGCCGACCTGCGCCCCCGGGGTGCGCGCGGTGC 

TGGACGGCTGCTCATGCTGTCTGGTGTGTGCCCGCCAGCGTGGCGAGAGCTGCTCAGATCTGGAGCCATGCGACGAG 

AGCAGTGGCCTCTACTGTGATCGCAGCGCGGACCCCAGCAACCAGACTGGCATCTGCACGGGTAATCCTGCTCCCTC 

TGCTGTTTGACCTCTTCTCCTGCAGCTAAGTGAAGCTGCTTCCTCCCTTCTCTTTTGTATTCCCCTTCCCAGAGGGC 
GATAAGCAAATAATAATAATGCAATAAAT



Segment nucleic acid sequences 
<210> SEQ ID NO 151

<211> Length : 232 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 151 
>H61775_node_2

ATCTGGTGGTTCTCCGGAGAGCAGCTTCCTTGGGTGTTACATGAGCCAAGCCCTCACTGTACAGAAGAGTGAGAGCT 
GAAACCTGTTCCCTGAGCTGATCAGAAGGACATCCCTTGGCCCCTCCATCTGGGCTCCTGTGGATAGGAGGGGCTGG 
GTGAGCAGGCCAGCTGGGCTATGGTGTGGTGCCTCGGCCTGGCCGTCCTCAGCCTGGTCATCAGCCAGGGGGCTGAC 
G 

<210> SEQ ID NO 152 

<211> Length : 189 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 152
>H61775_node_4

GTCGAGGGAAGCCTGAGGTGGTATCGGTGGTGGGCCGGGCTGGGGAGAGTGTGGTGCTGGGCTGTGACCTGCTGCCC 
CCGGCCGGCCGGCCCCCCCTGCATGTCATCGAGTGGCTGCGCTTTGGATTCCTGCTTCCCATCTTCATCCAGTTCGG 
CCTCTACTCTCCCCGAATTGACCCTGATTACGTGG 

<210> SEQ ID NO 153 

<211> Length : 201 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 153 
>H61775_node_6

GTTCTCCTCAGGTGGGAGGTAGGGAGGTATCAGCAAGAAAGGTGGGCTGGGTAGAGTCGCACAAGGCCTCCTATGAA

CGGCTTTGTCCCTGCTCTGATCTCATCTCCAGCTCTGCTGCCTTAACTCTGCTTAATAAGCATGGCTGTGCTCCCAA 
GCAGTGTTAATTCATTGAAAGATGTCATTCATTTACACACACACACA 

<210> SEQ ID NO 154 

<211> Length : 698 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 154 
>H61775_node_8 

GTGACTGTGGGTTCCCTGCCTTCCGAGAGCTTAAGAGAGCAGAGACTGTGTCTCCTGTTTTCTTCACACGCCGCTGC 

ATATGGGAAGATCTGAAGTCAACAGGCTTTAGCCCTGCAGGTGGAGGGAGGCCTCCAGGAGGTGGGCCCAGGACTCA 

GGAGGACTCAGGGCTGCCCTGCTGGCGATCTTCCTGTTCTGTAACACTACAGGTCTAGCAGTCCAGCTGTCACAGAA 

AAGCTAGGACATGCAGTATGCTTCTTTGGATATTCTGAGTAACATTTGGACTGTTACCCATTGGCTACCAGCATCTC 

CCAAGTGAGAATACATAGATTACCCCCAGTGCCCTGAACAGCACTCGGTCCTAACACCCGTGTCCATGGAAAGCACG 

CCGCGTCTGGAGAAAGAAGCCGAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATT 

CCTTCAGCTACAATGGAATTCTTGGGAGGATTAAATATGGTGACAACGCCTAATATTAGATGGCCTGTATTCCACAC 

TCAATCTTCCTTCCCTCTTCTTCCTTCTTTGTAGAGCTATAATGAAAAGTATCATGTGGGACACAGAAGAGGTTGCA 

GTCTGGGGTCTGCAGGGCTTAGCGGCCAGGCAGATTAGCTTTCTTGAGGAATCCTGACAGTGGGTGGAAGGGTATGA 
TGATG 



<210> SEQ ID NO 155 

<211> Length : 86 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 155 
>H61775_node_0

GGAGGCGCTCGGGGCATCCGAGGCGGGGAGGCGGGTCCGCCCCCTATTGTGTAGCGGCGAGAGTGGAGCCGAGCGGT 
GCGGAGCAG 



<210> SEQ ID NO 156 
<211> Length : 7
<212> Type : DNA 
<213> Organism : Homo sapiens 

<400> sequence : 156 

>H61775_node_5 

GATAAGA 



<210> SEQ ID NO 157 

<211> Length : 203 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 157 
>M85491_PEA_1_node_0

1 C ™ CCCGGGCCGICCGGGC ^ 

AGGCTGGGGGCCGCGCTGCTGCTGCTGCCGCTGCTCGCCGCCGTGGAAG 

<210> SEQ ID NO 158 

<211> Length : 229 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 158 
>M85491_PEA_1_node_13

AGAGACAGGATCTCACTATGTTGTCCAGGCTGGTCTTGAACTCGTGGCCACAAATGATCCTCCCACCTCAGCCTCCC
AAAGTGTTGGAATTATAGGCATGAACCACCATGCCCAGGAGGAGAATTTTTGATAATAATATTTTGTGGACATCTTT
GCATATCATGTCAGAGCTATAACATCATTGTGGAGAAGCTCTTAGGATCCCATAGAATAAATGTACCGTAATTTA



<210> SEQ ID NO 159 
<211> Length : 336

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 159 
>M85491_PEA_1_node_21

CCATCCCCTCCGCGCCCCAGGCTGTGATTTCCAGTGTCAATGAGACCTCCCTCATGCTGGAGTGGACCCCTCCCCGC 
GACTCCGGAGGCCGAGAGGACCTCGTCTACAACATCATCTGCAAGAGCTGTGGCTCGGGCCGGGGTGCCTGCACCCG 
CTGCGGGGACAATGTACAGTACGCACCACGCCAGCTAGGCCTGACCGAGCCACGCATTTACATCAGTGACCTGCTGG 
CCCACACCCAGTACACCTTCGAGATCCAGGCTGTGAACGGCGTTACTGACCAGAGCCCCTTCTCGCCTCAGTTCGCC 
TCTGTGAACATCACCACCAACCAGGCAG 

<210> SEQ ID NO 160 

<211> Length : 125 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 160 
>M85491_PEA_1_node_23

CTCCATCGGCAGTGTCCATCATGCATCAGGTGAGCCGCACCGTGGACAGCATTACCCTGTCGTGGTCCCAGCCAGAC 
CAGCCCAATGGCGTGATCCTGGACTATGAGCTGCAGTACTATGAGAAG 

<210> SEQ ID NO 161 

<211> Length : 1,305 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 161 
>M85491_PEA_1_node_24

GTACCTATTGGCTGGGTGCTGTCCCCATCACCCACCTCCCTGAGGGCCCCTCTCCCAGGCTGAGGCCTGGGAGTTCT 
GCCCCACCGCAAGATGAGACGCACTGGTGCAGCAGAAAGAGCACTGGCCTTGGAGTCAGGCTGCCTGGCTCCCAATC 
CAGCTCCGCTCCTTCCCACTGTGAGACCTCAGGCAGGTGCCTTGACCTCTCTGGATCTCACTTTTCTGGTCTGGAGG 
ATACACCCAGCAATCTCAGTGAAATGCAACAGTCACATCCCTTTCCCTACCACGACCCTTTCATCTTGACCTCAGTG

GCTTGATGTTGGGAAAAACTGGGTTTCCAAAAAGCTGCACTTATGAAGTGATAATTAGTCACTCACCTCTTCTTCGA 

CAGAGATTTGAAACAGCTCAAGAGAGCTTCCGCCTGCCCTGCTCTGAGTCCTGCTAAAACACCCACTTTCACTCGCC 

TGCATGCCCTTTGCATGGGGAGAGGTGATTTCACTTTGAGCTTTTAAATCAGACCTTAATTACTCCCTTTGGGTGGA 

AGCCCCTGGGATGGTAGAAGGATCACTGGACTAAGAGTGAGAAGCCGTAGGTTCAAATCCCAGCTCCGTCCTTCACC 

AGCTATGTGACCTTGGGCAGGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCCC 

TCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCAGAATGGGACCTGGCCCTCAGCAGAGGCCAT 

GTGTGTCCCTGGCCTTCCTCCTCTGCCCTGCCTGCTGCACAGTGGGCAATGGTGACAGGATGGGAGGCCAAGTGGAT 

GTGGGGTCTGCACAGTACAGGGGCCAGGAGGTAGACAGCACAATTGCCCACCCACATGGCTGGACATCAGAGGCCCC 

AGGAAGCCTCTCCTTTGAATGATCACTTCTCTTACCTGCTCCAGGAGGCAACAAACAGCCACAGAGGCTGCAAGGGC 

ACCTGGGAAAGGCATCGCGGGGCTTCCATTCAGACTAGGTGTCAATGACTGACAGGGAGGCCTTTGGTTGAGGGCAA 

GCCCACGGGGAACTGCAGATGGATGGAAGGGCTCTCCCTGAAGGCTGAGAGGAAGAGTGCAGTCAATTGCAGCCAGT 

CCTGCTGGAGCCCAACTTTCTAGAGCCCAGCCCGGCCTTCCCACTCTGTTAACTGCTGGATCGGCTAACCAGGCCGG 

TCTCCAGGGCCTTTCAAACACTTACCCAGCCTTTGCCGGCCGTCTTACCATTGCTTGCGTGCGTGTTCATCCC 

<210> SEQ ID NO 162 

<211> Length : 404 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 162 
>M85491_PEA_1_node_8

TGGGAAGAGGTGAGTGGCTACGATGAGAACATGAACACGATCCGCACGTACCAGGTGTGCAACGTGTTTGAGTCAAG 

CCAGAACAACTGGCTACGGACCAAGTTTATCCGGCGCCGTGGCGCCCACCGCATCCACGTGGAGATGAAGTTTTCGG 

TGCGTGACTGCAGCAGCATCCCCAGCGTGCCTGGCTCCTGCAAGGAGACCTTCAACCTCTATTACTATGAGGCTGAC 

TTTGACTCGGCCACCAAGACCTTCCCCAACTGGATGGAGAATCCATGGGTGAAGGTGGATACCATTGCAGCCGACGA 

GAGCTTCTCCCAGGTGGACCTGGGTGGCCGCGTCATGAAAATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCC 
GCAGCGGCTTCTACCTGGC 



<210> SEQ ID NO 163 
<211> Length : 184 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 163 
>M85491_PEA_1_node_9

CTTCCAGGACTATGGCGGCTGCATGTCCCTCATCGCCGTGCGTGTCTTCTACCGCAAGTGCCCCCGCATCATCCAGA 

ATGGCGCCATCTTCCAGGAAACCCTGTCGGGGGCTGAGAGCACATCGCTGGTGGCTGCCCGGGGCAGCTGCATCGCC 
AATGCGGAAGAGGTGGATGTACCCATCAAG 

<210> SEQ ID NO 164 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 164 
>M85491_PEA_1_node_10

CTCTACTGTAACGGGGACGGCGAGTGGCTGGTGCCCATCGGGCGCTGCATGTGCAAAGCAGGCTTCGAGGCCGTTGA 
GAATGGCACCGTCTGCCGAG 

<210> SEQ ID NO 165 

<211> Length : 91 

<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 165 
>M85491_PEA_1_node_18

GTTGTCCATCTGGGACTTTCAAGGCCAACCAAGGGGATGAGGCCTGTACCCACTGTCCCATCAACAGCCGGACCACT 
TCTGAAGGGGCCAC 
<210> SEQ ID NO 166

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 166 
>M85491_PEA_1_node_19

CAACTGTGTCTGCCGCAATGGCTACTACAGAGCAGACCTGGACCCCCTGGACATGCCCTGCACAA 

<210> SEQ ID NO 167 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 167 
>M85491_PEA_1_node_6

AAACGCTAATGGACTCCACTACAGCGACTGCTGAGCTGGGCTGGATGGTGCATCCTCCATCAGGG 



<210> SEQ ID NO 168 

<211> Length : 810 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 168 
>T39971_node_0 

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 
TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA 
AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 
GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 
TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT
ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 
CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 
GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 
AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 
AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 
CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGG 

<210> SEQ ID NO 169 

<211> Length : 168 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 169 
>T39971_node_18 

GTGCCAGGGGCTGTGGGCCAGGGTAGAAAGCATCTAGGGAGGGTTTGAGAGCTATTGCTCCCAGGGACAGGGTGGAC 
AGGGAAGCTGGACCCAGGGCCCTGCAGGACCTGGTGGGAGCTCTGTGAGCACAGGGCAGCCCCAAGACTCCAGGTCC 
TGGGCAGTGAACCT 



<210> SEQ ID NO 170 

<211> Length : 157 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 170 
>T39971_node_21 

GGTAGTCAGTACTGGCGCTTTGAGGATGGTGTCCTGGACCCTGATTACCCCCGAAATATCTCTGACGGCTTCGATGG 
CATCCCGGACAACGTGGATGCAGCCTTGGCCCTCCCTGCCCATAGCTACAGTGGCCGGGAGCGGGTCTACTTCTTCA 
AGG 
<210> SEQ ID NO 171

<211> Length : 198 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 171 
>T39971_node_22

AGTACTCAGGGGGTGGTGGGAGACTGAGCAGGCAGTGGAGCAGTCTTGGATTCCTTTCACATTTCACTGGGGACAGG 
CCTCAGCATGTGCCCACCCCTGACCCCCACCTCATGCTGGGAGATCCTAACTTCAACAGCCTCTGGGATCTCCAGTC 
TTGCCCTGGCCCAGCCCTCCTAATGCCCACCACCCCGCTCCTCG 

<210> SEQ ID NO 172 

<211> Length : 153 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 172 
>T39971_node_23 

GGAAACAGTACTGGGAGTACCAGTTCCAGCACCAGCCCAGTCAGGAGGAGTGTGAAGGCAGCTCCCTGTCGGCTGTG 
TTTGAACACTTTGCCATGATGCAGCGGGACAGCTGGGAGGACATCTTCGAGCTTCTCTTCTGGGGCAGAACCTCTG 

<210> SEQ ID NO 173 

<211> Length : 140 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 173 
>T39971_node_31

GCCATCCCGCGCCACGTGGCTGTCCTTGTTCTCCAGTGAGGAGAGCAACTTGGGAGCCAACAACTATGATGACTACA 
GGATGGACTGGCTTGTGCCTGCCACCTGTGAACCCATCCAGAGTGTCTTCTTCTTCTCTGGAG 



<210> SEQ ID NO 174 

<211> Length : 127 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 174 
>T39971_node_33

ACAAGTACTACCGAGTCAATCTTCGCACACGGCGAGTGGACACTGTGGACCCTCCCTACCCACGCTCCATCGCTCAG 
TACTGGCTGGGCTGCCCAGCTCCTGGCCATCTGTAGGAGTCAGAGCCCAC 



<210> SEQ ID NO 175 

<211> Length : 223 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 175 
>T39971_node_7

TGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAGGAGAAAAACAATGCC 
ACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGGGAATCCTGAGCAGAC 
ACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGGGGATAGA 



<210> SEQ ID NO 176 
<211> Length : 9
<212> Type : DNA 
<213> Organism : Homo sapiens 

<400> sequence : 176 

>T39971_node_1

CTGACCAAG 

<210> SEQ ID NO 177 

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 177 
>T39971_node_10 

GGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 

<210> SEQ ID NO 178 

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 178 
>T39971_node_11

CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCT 
<210> SEQ ID NO 179

<211> Length : 14 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 179 

>T39971_node_12

CTTTGCCTTCCGAG 

<210> SEQ ID NO 180 

<211> Length : 32 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 180 
>T39971_node_15

GGCAGTACTGCTATGAACTGGACGAAAAGGCA 

<210> SEQ ID NO 181 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 181 

>T39971_node_16 

GTGAGGCCTGGGTACCCCAAGCTC 
<210> SEQ ID NO 182

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 182 
>T39971_node_17 

ATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCACCCGCATCAACTGTCAGGGGAAGACCTACCT 
CTTCAAG 

<210> SEQ ID NO 183 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 183 
>T39971_node_26

CTGGTACCAGACAGCCCCAGTTCATTAGCCGGGACTGGCACG 

<210> SEQ ID NO 184 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 184 
>T39971_node_27 

GTGTGCCAGGGCAAGTGGACGCAGCCATGGCTGGCCGCATCTACATCTCAG 
<210> SEQ ID NO 185

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 185 
>T39971_node_28
GCATGGCAC 

<210> SEQ ID NO 186 
<211> Length : 95 
<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 186 
>T39971_node_29 

CCCGCCCCTCCTTGGCCAAGAAACAAAGGTTTAGGCATCGCAACCGCAAAGGCTACCGTTCACAACGAGGCCACAGC 
CGTGGCCGCAACCAGAAC 

<210> SEQ ID NO 187 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 187 
>T39971_node_3 

AGTCATGCAAGGGCCGCTGCACTGAGGGCTTCAACGTGGACA 
<210> SEQ ID NO 188

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 188 

>T39971_node_30 

TCCCGCCG 

<210> SEQ ID NO 189 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 189 
>T39971_node_34
ATGGCCG 

<210> SEQ ID NO 190 

<211> Length : 17 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 190 
>T39971_node_35

GGCCCTCTGTAGCTCCC

<210> SEQ ID NO 191 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 191 
>T39971_node_36 

TCCTCCCATCTCCTTCCCCCAGCCCAATAAAGGTCCCTTAGCCCCGAAAAAAAAGCKATAAT 

<210> SEQ ID NO 192 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 192 
>T39971_node_4
AGAAGTGCCAGTGTGACGAG 



<210> SEQ ID NO 193 

<211> Length : 58 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 193
>T39971_node_5 

CTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTGAGTGCAAGCCCCAAG 



<210> SEQ ID NO 194 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 194 

>T39971_node_8 

CTCAAG 



<210> SEQ ID NO 195 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 195 

>T39971_node_9

GCCTGAGACCCTTCATCCAG



<210> SEQ ID NO 196 
<211> Length : 327 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 196 
>Z21368_PEA_1_node_0

AGGTTACTTGACTGGGAGTTCTCAGACCTCCAGTTTCAGCCCTGCCCTCAGCCTCCAATCCGTAAGAGACACCCAGC 
CCCAGCAATTGGATTGGGCAGCCCGTCTTGACACACCACTGTGCTGAGTGCTTGAGGACGTGTTTCAACAGATGGTT 
GGGGTTAGTGTGTGTCATCACATTCGAGTGGGGATTAAGAGAAGGAAGGCTGCCTTGCTGGAGCTGTGTGGTCTTCT 
CCAAGTGAGAGTCGCAGGCAATAGAACTACTTTGCTTTTGGAGGAAAAGGAGGAATTCATTTTCAGCAGACACAAGA 

AAAGCAGTTTTTTTTTCAG 



<210> SEQ ID NO 197 

<211> Length : 177 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 197 
>Z21368_PEA_1_node_15

AACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATGAAGTATTCTTGCTG 
TGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGTTCAGAG 
GACGGATACAGCAGGAACGAAAA 



<210> SEQ ID NO 198 

<211> Length : 240 

<212> Type : DNA 

<213> Organism : Homo sapiens 




<400> sequence : 198 
>Z21368_PEA_1_node_19

GGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATCAATGCCTTTGTGACT 
ACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAATGTCTACACCAACAA 
CGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATCTTAACAACACTGGCT 
ACAGAACAG 



<210> SEQ ID NO 199 

<211> Length : 300 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 199 
>Z21368_PEA_1_node_2

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTT 



<210> SEQ ID NO 200 

<211> Length : 152 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 200 
>Z21368_PEA_1_node_21

CCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGGTGGCGAGAATGGCTTGGATTAATC 
AAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAAGCATGGATTTGATTATGCAAAG 
<210> SEQ ID NO 201

<211> Length : 176 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 201 
>Z21368_PEA_1_node_33

CTGTAXAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCATTTACACCGCCGACCATGGTTACCATAT 
TGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTC 

CAAGTGTAGAACCAGGATCAAT 



<210> SEQ ID NO 202 

<211> Length : 129 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 202 
>Z21368_PEA_1_node_36

AGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATG 
TGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAG 



<210> SEQ ID NO 203 

<211> Length : 279 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 203
>Z21368_PEA_1_node_37

GTGTGTCATTGTTCCTCCTCTCAGCCAGCCCCAAATACACTGAGCTCCAGCTGGTGCCCAGAGCCAGCCAGCAGCTG 
AAGACATGGAGGCAGAATATGCCTTGCCCACAAGGATCACCCCAAGCTGAGCATTTCTCAGCTGCTTGTGAATAGCA 
TATTGATGGAGATGCACTCATGGTCTGTGGGAAGTGAGAGGTGTTTCTTTAAATAAGCTGTTAGCACAGATCCATTT 
GGAAAAACGTCCAGATGCCAAAAGTAAATATTATCATTTTGCTTTCAG 



<210> SEQ ID NO 280 

<211> Length : 853 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 280 
>Z21368_PEA_1_node_39

GTAATTATTGGTTCCTGGGGTGCTTCTGGGAACCAGTCCTAGTGGGCAGCTTTCCCTGCTGAGTATTTTTTTTCTCC 
TTATTTTTGTTTACTAAGCATGCAGATTTCGTAAACCTAGTCACAAGATTGAATGGTTTGCTGCTTATTCTGTAGTG 
GTCAATAGAGTAATAATTGCTGGATCAGAATTGTAAAGAATAACCCTCAAGTTGGTTAATTGGTACAAAAACACAGT 
TAGATAGAAGTTATAGAATTTGATAGTATAGTTGGGACATTATCGTTAACAATAATTTATGTATATCTTAAAATAGC 
TAGAAGTGAAGAATTGCAAAGTTCCCAACACAAGGAAAAGATAAATGAGATGATGAATATCCCAATTATCTTGATTT 
GATCATTACACATTGTAGACTGGTATCCATATATCACACGTACCCCCAAAATATGTATAATTGTGATATATCAATTT 
TTAAAATACCAAAAAAGCAAGAGAATGACGACTCCACATCCCCCAAAAAGAATAAATTCTCATAAGCTTGGACCAAA 
GCCTTTATCATGGGTGTAGATTTACTGTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTC 
TACCATCCTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCTTCTTCCTACTTCCTACTTCTTT 
TCCTTGTGTGCCTTTCCTAGTTTTAAATAGATAAATGTATGCCATTGTAATTATTTCCATTGTCACTTCTGGGTTTC 
CCCTTTTGGTTCATTAATACCCATTGCCTTGTTTTTCTCTGTACATAAATTAGGAGAGAGAAAATATTTGTATAATT 
TTTTTA 



<210> SEQ ID NO 281 
<211> Length : 162 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 281 
>Z21368_PEA_1_node_4

GGATTCTTCACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTAT 
TTAGTCTACAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGA 

ATGGAGAG

<210> SEQ ID NO 282 

<211> Length : 130 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 282 
>Z21368_PEA_1_node_41

CAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCA 
AAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAG 

<210> SEQ ID NO 283 

<211> Length : 217 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 283 
>Z21368_PEA_1_node_43

AAGTGGCAATGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGT 
CCGGCAGAGCACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTT 
ACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAA 
<210> SEQ ID NO 284

<211> Length : 256 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 284 
>Z21368_PEA_1_node_45

AGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATA 
AATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGG 
GCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCAC 
CTACCACTGTCCGAGTGACACACAA 



<210> SEQ ID NO 285 

<211> Length : 176 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 285 
>Z21368_PEA_1_node_53

GGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGAGGAAGG 
AGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAACCACTGG 

CAGACAGCCCCGTTCTGGAACC 



<210> SEQ ID NO 286 
<211> Length : 143 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 286 
>Z21368_PEA_1_node_56

TGGGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAAT 
TTTCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAG 

<210> SEQ ID NO 287 

<211> Length : 124 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 287 
>Z21368_PEA_1_node_58

CTCACAAATACAGTGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTG 
TCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATCTTGATGTTG 



<210> SEQ ID NO 288 

<211> Length : 588 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 288 
>Z21368_PEA_1_node_66

AGGACAGTTATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAG 
CTACACAGTGTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTT 
GAAGGATTTAGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTC 
AAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGAC 
CCAAGGCATAAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCA^ACCCTGCATTTGAACCGACCAAC
ATTAAGTCCAGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAA 
AAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTA 
GAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAG 



<210> SEQ ID NO 289 

<211> Length : 585 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 289 
>Z21368_PEA_1_node_67

AACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATT 
TCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAA 
GGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGA 
TGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGC 
CACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACG 
GAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGA 
TTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGACATAAGTATATACATGT 
TATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAA 



<210> SEQ ID NO 290 

<211> Length : 1,188 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 290 




>Z21368_PEA_1_node_69

TTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAA 
ATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGAC 
AAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATG 
TTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAA 
TAAATAAGTAAACAAAATGAAGATTGCCTGCTCTCTCTGTGCCTAGCCTCAAAGCGTTCATCATACATCATACCTTT 
AAGATTGCTATATTTTGGGTTATTTTCTTGACAGGAGAAAAAGATCTAAAGATCTTTTATTTTCATCTTTTTTGGTT 
TTCTTGGCATGACTAAGAAGCTTAAATGTTGATAAAATATGACTAGTTTTGAATTTACACCAAGAACTTCTCAATAA 
AAGAAAATCATGAATGCTCCACAATTTCAACATACCACAAGAGAAGTTAATTTCTTAACATTGTGTTCTATGATTAT 
TTGTAAGACCTTCACCAAGTTCTGATATCTTTTAAAGACATAGTTCAAAATTGCTTTTGAAAATCTGTATTCTTGAA 
AATATCCTTGTTGTGTATTAGGTTTTTAAATACCAGCTAAAGGATTACCTCACTGAGTCATCAGTACCCTCCTATTC 
AGCTCCCCAAGATGATGTGTTTTTGCTTACCCTAAGAGAGGTTTTCTTCTTATTTTTAGATAATTCAAGTGCTTAGA 
TAAATTATGTTTTCTTTAAGTGTTTATGGTAAACTCTTTTAAAGAAAATTTAATATGTTATAGCTGAATCTTTTTGG 
TAACTTTAAATCTTTATCATAGACTCTGTACATATGTTCAAATTAGCTGCTTGCCTGATGTGTGTATCATCGGTGGG 
ATGACAGAACAAACATATTTATGATCATGAATAATGTGCTTTGTAAAAAGATTTCAAGTTATTAGGAAGCATACTCT 
GTTTTTTAATCATGTATAATATTCCATGATACTTTTATAGAACAATTCTGGCTTCAGGAAAGTCTAGAAGCAATATT 
TCTTCAAATAAAAGGTGTTTAAACTTTTTTCTG 



<210> SEQ ID NO 291 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 291 
>Z21368_PEA_1_node_11

GACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTG 



<210> SEQ ID NO 292

<211> Length : 28

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 292 

>Z21368_PEA_1_node_12

ATTATTCAACCAGGATACCTAATTCAAG 



<210> SEQ ID NO 293 

<211> Length : 15 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 293 
>Z21368_PEA_1_node_16
AACATCCGACCCAAC 



<210> SEQ ID NO 294 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 294 
>Z21368_PEA_1_node_17

ATTATTCTTGTGCTTACCGATGATCAAGATGTGGAGCTGG 



<210> SEQ ID NO 295 
<211> Length : 74 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 295 
>Z21368_PEA_1_node_23

GACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTAAGAGAATGTATCCCCATAG 



<210> SEQ ID NO 296 

<211> Length : 96 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 296 
>Z21368_PEA_1_node_24

GCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTCTAAACTGTACC 
CCAATGCTTCCCAACACAT 



<210> SEQ ID NO 297 

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 297 
>Z21368_PEA_1_node_30

AACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTATGCAGTACACAG 
<210> SEQ ID NO 298

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 298 
>Z21368_PEA_1_node_31

GACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTGATGTCAGTGGAT 
GATTCTGTGGAGAGG 



<210> SEQ ID NO 299 

<211> Length : 57 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 299 
>Z21368_PEA_1_node_38

GTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGG 



<210> SEQ ID NO 300 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 300
>Z21368_PEA_1_node_41

GTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATA 
AGGCATACATTGACAAAGAG 

<210> SEQ ID NO 301 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 301 
>Z21368_PEA_1_node_49

ATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATG 
TAGCTGCAGTAAACAAAG 

<210> SEQ ID NO 302 

<211> Length : 66 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 302 
>Z21368_PEA_1_node_51

CTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCCATTCAA 



<210> SEQ ID NO 303 
<211> Length : 34

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 303 
>Z21368_PEA_1_node_61

GAAATAAAGATGGAGGAAGCTATGACCTACACAG 

<210> SEQ ID NO 304 

<211> Length : 53 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 304 
>Z21368_PEA_1_node_68

CTTGACACCCCTGGTAAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAG 

<210> SEQ ID NO 305 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 305 
>Z21368_PEA_1_node_7

GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAG 
SEQ ID NO: 306

>H53626_PEA_1_node_25

GTGCGCAGCGACGTGAAGCCGGTGATCCAGTGGCTGAAGCGCGTGGAGTACGGCGCCGAGGGCCGCCACAACTCCAC 
CATCGATGTGGGCGGCCAGAAGTTTGTGGTGCTGCCCACGGGTGACGTGTGGTCGCGGCCCGACGGCTCCTACCTCA 
ATAAGCTGCTCATCACCCGTGCCCGCCAGGACGATGCGGGCATGTACATCTGCCTTGGCGCCAACACCATGGGCTAC 

AGCTTCCGCAGCGCCTTCCTCACCGTGCTGCCAG 

SEQ ID NO: 307 

>H53626_PEA_1_node_26

GTGCGCGGCTGCCACGCCACGCCACACCATGCTGGTGCCCGGACCCGCCCCCTGGGCCCGGCGTCCCACCCACCGGG 
TGGGGCCCCACCCTTCCCTCCCGGGCCGTGCTGGCCAGGTCATCTGCCGAGGGAGGGCAGCCCAGGGGCACCGTCTC 
CACAGCCCCTGGGATGGGTCTGGGGTGCTCTCCTGGTCTTTGTGTCGGCGTTCCCCTCCCTACCTCCTTTCCTCTCG 

CTCTTGCAG 

SEQ ID NO: 308 

>H53626_PEA_1_node_27

ACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTCGGCCACTAGCCTGCCGTGGCCCGTGGTCATCGGCATC 
CCAGCCGGCGCTGTCTTCATCCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCAGAAGAAGCCGTGCACCCCCGC 
GCCTGCCCCTCCCCTGCCTGGGCACCGCCCGCCGGGGACGGCCCGCGACCGCAGCGGAGACAAGGACCTTCCCTCGT 
TGGCCGCCCTCAGCGCTGGCCCTGGTGTGGGGCTGTGTGAGGAGCATGGGTCTCCGGCAGCCCCCCAGCACTTACTG 
GGCCCAGGCCCAGTTGCTGGCCCTAAGTTGTACCCCAAACTCTACACAGACATCCACACACACACACACACACACTC 
TCACACACACTCACACGTGGAGGGCAAGGTCCACCAGCACATCCACTATCAGTGCTAGACGGCACCGTATCTGCAGT 
GGGCACGGGGGGGCCGGCCAGACAGGCAGACTGGGAGGATGGAGGACGGAGCTGCAGACGAAGGCAGGGGACCCATG 
GCGAGGAGGAATGGCCAGCACCCCAGGCAGTCTGTGTGTGAGGCATAGCCCCTGGACACACACACACAGACACACAC 
ACTACCTGGATGCATGTATGCACACACATGCGCGCACACGTGCTCCCTGAAGGCACACGTACGCACACACGCACATG 
CACAGATATGCCGCCTGGGCACACAGATAAGCTGCCCAAATGCACGCACACGCACAGAGACATGCCAGAACATACAA 
GGACATGCTGCCTGAACATACACACGCACACCCATGCGCAGATGTGCTGCCTGGACACACACACACACACGGATATG 
CTGXCTGGACGCACACACGTGCAGATATGGTATCCGGACACACACGTGCACAG 

SEQ ID NO: 309 
>H53626_PEA_1_node_34

GCAGATATGCTGCCTGGACACACACACAGATAATGCTGCCTCAACACTCACACACGTGCAGATATTGCCTGGACACA
CACATGTGCACAGATATGCTGTCTGGACATGCACACACGTGCAGATATGCTGTCCGGATACACACGCACGCACACAT 
GCAGATATGCTGCCTGGGCACACACTTCCGGACACACATGCACACACAGGTGCAGATATGCTGCCTGGACACACGCA 
GACTGACGTGCTTTTGGGAGGGTGTGCCGTGAAGCCTGCAGTACGTGTGCCGTGAGGCTCATAGTTGATGAGGGACT 
TTCCCTGCTCCACCGTCACTCCCCCAACTCTGCCCGCCTCTGTCCCCGCCTCAGTCCCCGCCTCCATCCCCGCCTCT 
GTCCCCTGGCCTTGGCGGCTATTTTTGCCACCTGCCTTGGGTGCCCAGGAGTCCCCTACTGCTGTGGGCTGGGGTTG 

GGGGCACAG 

SEQ ID NO: 310 

>H53626_PEA_1_node_35

CAGCCCCAAGCCTGAGAGGCTGGAGCCCATGGCTAGTGGCTCATCCCCACTGCATTCTCCCCCTGACACAGAGAAGG 
GGCCTTGGTATTTATATTTAAGAAATGAAGATAATATTAATAATGATGGAAGGAAGACTGGGTTGCAGGGACTGTGG 

TCTCTCCTGGGGCCCGG 

SEQ ID NO: 311 

>H53626_PEA_1_node_36

GACCCGCCTGGTCTTTCAGCCATGCTGATGACCACACCCCGTCCAGGCCAGACACCACCCCCCACCCCACTGTCGTG 
GTGGCCCCAGATCTCTGTAATTTTATGTAGAGTTTGAGCTGAAGCCCCGTATATTTAATTTATTTTGTTAAACATGA 
AAGTGCATCCTTTCCCTCCA 

SEQ ID NO: 312 

>H53626_PEA_1_node_11

GTCCGGACAGGCCGAGATGACGCCGAGCCCCCTGTTGCTGCTCCTGCTGCCGCCG 

SEQ ID NO: 313 

>H53626_PEA_1_node_12

CTGCTGCTGGGGGCCTTCCCGCCGGCCGCCGCCGCCCGAG 

SEQ ID NO: 314 

>H53626_PEA_1_node_16

GTCAACTACACCCTCGTCGTGCTGG 

SEQ ID NO: 315 

>H53626_PEA_1_node_19

ATGACATTAGCCCAGGGAAGGAGAGCCTGGGGCCCGACAGCTCCTCTGGGG 

SEQ ID NO: 316 
>H53626_PEA_1_node_20

GTCAAGAGGACCCCGCCAGCCAGCAGTGGG

SEQ ID NO: 317 

>H53626_PEA_1_node_24

AGCGGACCCGTTCCAAGCCCGTGCTCACAGGCACGCACCCCGTGAACACGACGGTGGACTTCGGGGGGACCACGTCC 
TTCCAGTGCAAG 

SEQ ID NO: 318 

>H53626_PEA_1_node_28

ATATGCTGCCTGGACACACAGATAATGCTGCCTTGACACACACATGCACGGATATTGCCTGGACACACACACACACA 
C 

SEQ ID NO: 319 

>H53626_PEA_1_node_29

GCGTGCACAGATATGCTGTCTGGACACGCACACACATGCAGATATGCTGCCTGGACACACACTTCCAGACACACGTG 
CACAGGCGCAGAT 

SEQ ID NO: 320 

>H53626_PEA_1_node_30

ATGCTGCCTGGACACACGCAGATATGCTGTCTAGTCACACACACAC 

SEQ ID NO: 321 

>H53626_PEA_1_node_31

GCAGACATGCTGTCCGGACACACACAC 

SEQ ID NO: 322 

>H53626_PEA_1_node_32

GCATGCACAGATATGCTGTCCGGACACAC 

SEQ ID NO: 323 

>H53626_PEA_1_node_33

ACACGCAC 



SEQ ID NO: 324 
>H53626_PEA_1_P4

MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPPLTMWTKDGRTIHSGWSRF 
RVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFTQPSK
MRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAIN
ATYKVDVIQRTRSKPVLTGTHPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKFVVLPT 
GDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFRSAFLTVLPGARLPRHATPCWCPDPPPGPGVPPTGWG 
PTLPSRAVLARSSAEGGQPRGTVSTAPGMGLGCSPGLCVGVPLPTSFPLALADPKPPGPPVASSSSATSLPWPVVIG 
IPAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHL 
LGPGPVAGPKLYPKLYTDIHTHTHTHSHTHSHVEGKVHQHIHYQC 

SEQ ID NO: 325 
>H53626_PEA_1_P5

MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPPLTMWTKDGRTIHSGWSRF 
RVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVVLDDISPGKESLGPDSSSGGQEDPASQQWARPRFTQPSK
MRRRVIARPVGSSVRLKCVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAIN 
ATYKVDVIQRTRSKPVLTGTHPVNTTVDFGGTTSFQCKTQNRQGHLWPPRPRPLACRGPWSSASQPALSSSWAPCSC 
GFARPRRSRAPPRLPLPCLGTARRGRPATAAETRTFPRWPPSALALVWGCVRSMGLRQPPSTYWAQAQLLALSCTPN 
STQTSTHTHTHTLTHTHTWRARSTSTSTISARRHRICSGHGGAGQTGRLGGWRTELQTKAGDPWRGGMASTPGSLCV 

RHSPWTHTHRHTHYLDACMHTHARTRAP 

SEQ ID NO: 326 
>Ubiquitin Forward primer 
ATTTGGGTCGCGGTTCTTG 

SEQ ID NO: 327 

>Ubiquitin Reverse primer 

TGCCTTGACATTCTCGATGGT 

SEQ ID NO: 328 
>Ubiquitin-amplicon 

ATTTGGGTCGCGGTTCTTGTTTGTGGATCGCTGTGATCGTCACTTGACAATGCAGATCTTCGTGAAGACTCTGACTG 
GTAAGACCATCACCCTCGAGGTTGAGCCCAGTGACACCATCGAGAATGTCAAGGCA

SEQ ID NO: 329 
>SDHA Forward primer 
TGGGAACAAGAGGGCATCTG 

SEQ ID NO: 330 
>SDHA Reverse primer



CCACCACTGCATCAAATTCATG 

SEQ ID NO: 331 
>SDHA-amplicon

TGGGAACAAGAGGGCATCTGCTAAAGTTTCAGATTCCATTTCTGCTCAGTATCCAGTAGTGGATCATGAATTTGATG 
CAGTGGTGG 

SEQ ID NO: 332 
>PBGD Forward primer
TGAGAGTGATTCGCGTGGG 

SEQ ID NO: 333 

>PBGD Reverse primer

CCAGGGTACGAGGCTTTCAAT 

SEQ ID NO: 334 
>PBGD-amplicon

TGAGAGTGATTCGCGTGGGTACCCGCAAGAGCCAGCTTGCTCGCATACAGACGGACAGTGTGGTGGCAACATTGAAA 
GCCTCGTACCCTGG 

<210> SEQ ID NO 335 

<211> Length : 760 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 335 
>HUMGRP5E_node_0

CCAAAATCTATGGGCTGGGACAGCAAAGATGTGGCCTACGAAGAGAAAGGTCTGGAGAATCAGAAGGCCTTCAAATG 
GTGGTTCCAAATCCCTCCAGCAAAGCCCATCCATCTTTAGAGCTCACCCGTCTCCAGCTACACCCCCCACCCCTCCC 
GGCCCAGATCAGGCAGCGGGGTCGCCCTCTCCAGGACTCTCAAGGCAGCTAAGGCTGGAGGCGCCGGCGAGCCTGGA 
GAGGGAGGAGTTCACTAAATTGTGTTGGATGGAAGGCGTCGAGGACCGGAGGAATTAATCCGATGTGGGGAAGGCGG 
ACGGGGCTACGAGGAAAAAAGAGGGGGCAATGTACACTCAGCCTTTTCATCACTCGGCGGGGAGATGGATGGTTTTC 
CGGACCGGGCGTCCCAGCGCCCCGGTTAGCTATAGGGAGACGTCAGAGCGCTCTGGTCCGCGATAGAAGAGCCCCCC 
AGCCCCCCCGCCCGGGCTTCCATATAAAGTAGGGGCCCTAGTGGAGGCCGCAGCAGTAGCACCAGCGGCTGCGGCGG 
CGGAGCTCCTCCGAGGTCCGGGTCACCAGTCTCTGCTCTTCCCAGCCTCTCCGGCGCGCTCCAAGGGCTTCCCGTCG 
GGACCATGCGCGGCAGTGAGCTCCCGCTGGTCCTGCTGGCGCTGGTCCTCTGCCTGGCGCCCCGGGGGCGAGCGGTC
CCGCTGCCTGCGGGCGGAGGGACCGTGCTGACCAAGATGTACCCGCGCGGCAACCACTGGGCGGTGG 



<210> SEQ ID NO 336 

<211> Length : 224 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 336 
>HUMGRP5E_node_2 

GGCACTTAATGGGGAAAAAGAGCACAGGGGAGTCTTCTTCTGTTTCTGAGAGAGGGAGCCTGAAGCAGCAGCTGAGA 
GAGTACATCAGGTGGGAAGAAGCTGCAAGGAATTTGCTGGGTCTCATAGAAGCAAAGGAGAACAGAAACCACCAGCC 
ACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAGGATAGCAGCAACTTCAAAGAT 



<210> SEQ ID NO 337 

<211> Length : 359 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 337 
>HUMGRP5E_node_8 

GTTCTCAACGTGAAGGAAGGAACCCCCAGCTGAACCAGCAATGATAATGATGGCCTCTCTCAAAAGAGAAAAACAAA 
ACCCCTAAGAGACTGCGTTCTGCAAGCATCAGTTCTACGGATCATCAACAAGATTTCCTTGTGCAAAATATTTGACT 
ATTCTGTATCTTTCATCCTTGACTAAATTCGTGATTTTCAAGCAGCATCTTCTGGTTTAAACTTGTTTGCTGTGAAC 
AATTGTCGAAAAGAGTCTTCCAATTAATGCTTTTTTATATCTAGGCTACCTGTTGGTTAGATTCAAGGCCCCGAGCT 
GTTACCATTCACAATAAAAGCTTAAACACATTGTCCAAAGGGCAGGCTGTT 



<210> SEQ ID NO 338 
<211> Length : 19

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 338 

>HUMGRP5E_node_3 

GTAGGTTCAAAAGGCAAAG 

<210> SEQ ID NO 339 

<211> Length : 14 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 339 

>HUMGRP5E_node_7 

ACTCTCTGCTCCAG 

<210> SEQ ID NO 340 

<211> Length : 178 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 340 
>D56406_PEA_1_node_0

TTCACTCACTTTCAAAGCCAGCTGAAGGAAAGAGGAAGTGCTAGAGAGAGCCCCCTTCAGTGTGCTTCTGACTTTTA 
CGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGGAATGAAAATCCAGCTTGTATGCATGCTACTCCTGGCTT 

TCAGCTCCTGGAGTCTGTGCTCAG 
<210> SEQ ID NO 341

<211> Length : 780 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 341 
>D56406_PEA_1_node_13

TTAATCCAGGAAGATATTCTTGATACTGGAAATGACAAAAATGGAAAGGAAGAAGTCATAAAGAGAAAAATTCCTTA 
TATTCTGAAACGGCAGCTGTATGAGAATAAACCCAGAAGACCCTACATACTCAAAAGAGATTCTTACTATTACTGAG 
AGAATAAATCATTTATTTACATGTGATTGTGATTCATCATCCCTTAATTAAATATCAAATTATATTTGTGTGAAAAT 
GTGACAAACACACTTATCTGTCTCTTCTACAATTGTGGTTTATTGAATGTGATTTTTCTGCACTAATATAAATTAGA 
CTAAGTGTTTTCAAATAAATCTAAATCTTCAGCATGATGTGTTGTGTATAATTGGAGTAGATATTAATTAAGTCACC 
TGTATAATGTTTTGTAATTTTGCAAAACATATCTTGAGTTGTTTAAACAGTCAAAATGTTTGATATTTTATACCAGC 
TTATGAGCTCAAAGTACTACAGCAAAGCCTAGCCTGCATATCATTCACCCAAAACAAAGTAATAGCGCCTCTTTTAT 
TATTTTGACTGAATGTTTTATGGAATTGAAAGAAACATACGTTCTTTTCAAGACTTCCTCATGAATCTCTCAATTAT 
AGGAAAAGTTATTGTGATAAAATAGGAACAGCTGAAAGATTGATTAATGAACTATTGTTAATTCTTCCTATTTTAAT 
GAATGACATTGAACTGAATTTTTTGTCTGTTAAATGAACTTGATAGCTAATAAAAAGACAACTAGCCATCAAAATCA 
AAAGTTTCTC 



<210> SEQ ID NO 342 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 342 
>D56406_PEA_1_node_11

GCACGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAAGAGATGGAGAC 
CATCCCGGCTAACACG 
<210> SEQ ID NO 343

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 343
>D56406_PEA_1_node_2
ATTCAG 



<210> SEQ ID NO 344 

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 344 
>D56406_PEA_1_node_3

AAGAGGAAATGAAAGCATTAGAAGCAGATTTCTTGACCAATATGCATACATCAAAG 



<210> SEQ ID NO 345 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 345 
>D56406_PEA_1_node_5

ATTAGTAAAGCACATGTTCCCTCTTGGAAGATGACTCTGCTAAATGTTTGCAGTCTTGTAAATAATTTGAACAGCCC
AGCTGAGGAAACAGGAGAAGTTCATGAAGAGGAGCTTG 

<210> SEQ ID NO 346 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 346 
>D56406_PEA_1_node_6

TTGCAAGAAGGAAACTTCCTACTGCTTTAGATGG 

<210> SEQ ID NO 347 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 347 
>D56406_PEA_1_node_7
CTTTAGCTTGGAAGCAATGTTGACAA 

<210> SEQ ID NO 348 

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 348
>D56406_PEA_1_node_8
TATACCAG 



<210> SEQ ID NO 349 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 349
>D56406_PEA_1_node_9

CTCCACAAAATCTGTCACAGCAGGGCTTTTCAACACTGGGAG 



<210> SEQ ID NO 350 

<211> Length : 245 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 350 
>F05068_PEA_1_node_0

AAGAAAGGGAAGGCAACCGGGCAGCCCAGGCCCCGCCCCGCCGCTCCCCCACCCGTGCGCTTATAAAGCACAGGAAC 
CAGAGCTGGCCACTCAGTGGTTTCTTGGTGACACTGGATAGAACAGCTCAAGCCTTGCCACTTCGGGCTTCTCACTG 
CAGCTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACTTTCTCCTTCAAGTACTTGGCAGAT 
CACTCTCTTAGCAG 

<210> SEQ ID NO 351 
<211> Length : 161 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 351 
>F05068_PEA_1_node_10

CTTCGGGACGTGCACGGTGCAGAAGCTGGCACACCAGATCTACCAGTTCACAGATAAGGACAAGGACAACGTCGCCC 
CCAGGAGCAAGATCAGCCCCCAGGGCTACGGCCGCCGGCGCCGGCGCTCCCTGCCCGAGGCCGGCCCGGGTCGGACT 

CTGGTGT 

<210> SEQ ID NO 352 

<211> Length : 121 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 352 
>F05068_PEA_1_node_12

CCATGGTACAAGGAATAGTCGCGCAAGCATCCCGCTGGTGCCTCCCGGGACGAAGGACTTCCCGAGCGGTGTGGGGA 
CCGGGCTCTGACAGCCCTGCGGAGACCCTGAGTCCGGGAGGCAC 

<210> SEQ ID NO 353 

<211> Length : 630 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 353 
>F05068_PEA_1_node_13

CGTCCGGCGGCGAGCTCTGGCTTTGCAAGGGCCCCTCCTTCTGGGGGCTTCGCTTCCTTAGCCTTGCTCAGGTGCAA 
GTGCCCCAGGGGGCGGGGTGCAGAAGAATCCGAGTGTTTGCCAGGCTTAAGGAGAGGAGAAACTGAGAAATGAATGC 
TGAGACCCCCGGAGCAGGGGTCTGAGCCACAGCCGTGCTCGCCCACAAACTGATTTCTCACGGCGTGTCACCCCACC 
AGGGCGCAAGCCTCACTATTACTTGAACTTTCCAAAACCTAAAGAGGAAAAGTGCAATGCGTGTTGTACATACAGAG
GTAACTATCAATATTTAAGTTTGTTGCTGTCAAGATTTTTTTTGTAACTTCAAATATAGAGATATTTTTGTACGTTA 
TATATTGTATTAAGGGCATTTTAAAAGCAATTATATTGTCCTCCCCCTATTTTAAGACGTGAATGTCTCAGCGAGGT 
GTAAAGTTGTTCGCCGCGTGGAATGTGAGTGTGTTTGTGTGCATGAAAGAGAAAGACTGATTACCTCCTGTGTGGAA 
GAAGGAAACACCGAGTCTCTGTATAATCTATTTACATAAAATGGGTGATATGCGAACAGCAAACCAATAAACTGTCT 
CAATGCTGAATAAAA

<210> SEQ ID NO 354 

<211> Length : 150 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 354 
>F05068_PEA_1_node_4

GTGAGTCCGGGCAGCGCCTTCCCCCTTGCTGGTACCTGGCAGGCAAGGGGAACTGACCGTTGGTCCCGAAGGTCTAG 
AAGTGAATGGGAGCAGGGACAGGCCTGGGCGTCACCTGAACGCACGCGAATCGGGTCTGCTTGTGTTTTCCAG 

<210> SEQ ID NO 355 

<211> Length : 233 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 355 
>F05068_PEA_1_node_8

GTAACTACGCCCTGTGCTGTCCAGGGACGGGAGGGAAGGAAGGTGTGCGGGAGGAGTTCTCTGTCTCCACTCCCCTG 
GCCCGGGGGATCGTCGGGGCTGGACCGCAGCTCAGATGGCGCGAGCAGTTTCCAGCTCCCTCTGGCTCTAGAATGGC 
TCCCGTTCCCGGTGTTGGGGCCAAAGCTCTGCTTGATGGGGTCTCAAGTTGCCTTTCTTCCCCCTCCCCCCGCCCGC 
AG 

<210> SEQ ID NO 356 
<211> Length : 76 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 356 
>F05068_PEA_1_node_11

CTTCTAAGCCACAAGCACACGGGGCTCCAGCCCCCCCGAGTGGAAGTGCTCCCCACTTTCTTTAGGATTTAGGCGC 

<210> SEQ ID NO 357 

<211> Length : 119 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 357 
>F05068_PEA_1_node_3

GGTCTGCGCTTCGCAGCCGGGATGAAGCTGGTTTCCGTCGCCCTGATGTACCTGGGTTCGCTCGCCTTCCTAGGCGC 
TGACACCGCTCGGTTGGATGTCGCGTCGGAGTTTCGAAAGAA 

<210> SEQ ID NO 358 

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 358 
>F05068_PEA_1_node_5

GTGGAATAAGTGGGCTCTGAGTCGTGGGAAGAGGGAACTGCGGATGTCCAGCAGCTACC 



<210> SEQ ID NO 359 
<211> Length : 60 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 359 
>F05068_PEA_1_node_6

CCACCGGGCTCGCTGACGTGAAGGCCGGGCCTGCCCAGAC 



<210> SEQ ID NO 360 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 360 
>F05068_PEA_1_node_7

CCTTATTCGGCCCCAGGACATGAAGGGTGCCTCTCGAAGCCCCGAAGACAG 



<210> SEQ ID NO 361 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 361 
>F05068_PEA_1_node_9

CAGTCCGGATGCCGCCCGCATCCGAGTCAAGCGCTACCGCCAGAGCATGAACAACTTCCAGGGCCTCCGGAGCTTTG 
GCTGCCG 



<210> SEQ ID NO 362 
<211> Length : 573

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 362 
>H14624_node_0 

TTATGCTCCCGCGGAGGCCAAGCGGACTCCCTGACAGGACAGAATCTGAACGTGAGAGTGAAGGTCTTGCCTGTCCA 
GAAACTCTTGTAGCCAGCACAGGTTTAAACAAGAAGCCAAATTGTTCTGGAGAGATTGCTGGGGGCTTTCTTTGTGC 
CTCAAGCTTCTTCAGTGCCCTGAGCACAGGAAACACTCAAGCAGAGAAGCAGAGCCAAACCCAGGATACGGGAGGTC 
GAGGCTCTTCCGTAGACCTGCAGCATTGGGGTGGGATGATGTTCATTCTGTGTGTGTTCTGGACCAAGCCCCTCTCC 
AGGGACCTATGGGCAGCCCCCTTTAAGCAAGATGCCCGGTGGAGTGGGCATCCACCATCACTTACCCTGGGCTTGGG 
TGAATAGATTTTCCGTGCCTTAAATGGGCAGGGAGGGGGTAAACATGGACGGTCCATTGGTACAAATAAAAGCCTTT 
GGTGGGTTTTGATCAATTGCAAGGATCGAAGGAGACCTGTGGACCTGAGGTCAACTGGCAGCAGAGAAGAGTCTGGG 
TTCGTGAAGGCGCCGCCGCGGTGCCGCGCCACGT 

<210> SEQ ID NO 363 

<211> Length : 387 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 363 
>H14624_node_16

GTAAGCCTTCCCTCTTGCTTCCCCACTCCCTGCTGGGCTGAGACGCTCCCAGGAGATCCCGCCCCTGCCACGCATCC 
CAGTGCATCCCTGCTTGGGGTGCCAGTAGCGGGAAGGGCAGAAGTTCTGCCTGACCTGGTCTGTCATCACAACAAGC 
CTGTATCAAATTTGAGGCACCCCTCCCACGCCGCCCAAGTCTCGCGCATTCTCCTCCCGAGTTGTACCAGCTATACT 
TAAGGGCAGTTTAAAAATAAAACAAACAAACAAAAACAACAAAACTAAAAAAACGAAGAACTGAACGGCGGTTTAAA 
AAAAAATAGATACACGATTATTGTTAAAGATGCTAGCACTGGAGCTGCGCAGAGCGTTGGAAGTGGTGTTTGGTGGA 

GG 



<210> SEQ ID NO 364 
<211> Length : 249 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 364 
>H14624_node_3

ATTTGCATAAAAAAGGCCAAGAAAACTCTGGCTGTGCCCCAGCAACGGCTCATTCTGCTCCCCCGGGTCGGAGCCCC 
CCGGAGCTGCGCGCGGGCTTGCAGCGCCTCGCCCGCGCTGTCCTCCCGGTGTCCCGCTTCTCCGCGCCCCAGCCGCC 
GGCTGCCAGCTTTTCGGGGCCCCGAGTCGCACCCAGCGAAGAGAGCGGGCCCGGGACAAGCTCGAACTCCGGCCGCC 
TCGCCCTTCCCCGGCTCC 

<210> SEQ ID NO 365 

<211> Length : 10 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 365 
>H14624_node_10
GTGCTGGAGC 

<210> SEQ ID NO 366 

<211> Length : 35 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 366 
>H14624_node_11

AGGCCGGCGCTTGGATCCCGCTGGTCATGAAGCAG 



<210> SEQ ID NO 367 
<211> Length : 21

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 367 

>H14624_node_12 

TGCCACCCGGACACCAAGAAG 

<210> SEQ ID NO 368 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 368 
>H14624_node_13

TTCCTGTGCTCGCTCTTCGCCCCCGTCTGCCTCGATGACCTAGACGAGACCATCCAGCCATGCCACTCGCTCTGCGT 
GCAGGTGAAGGACCG 

<210> SEQ ID NO 370 

<211> Length : 60 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 370 
>H14624_node_14

CTGCGCCCCGGTCATGTCCGCCTTCGGCTTCCCCTGGCCCGACATGCTTGAGTGCGACCG 



<210> SEQ ID NO 371 
<211> Length : 71

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 371 
>H14624_node_15

TTTCCCCCAGGACAACGACCTTTGCATCCCCCTCGCTAGCAGCGACCACCTCCTGCCAGCCACCGAGGAAG 

<210> SEQ ID NO 372 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 372 
>H14624_node_4 

GCTCCCTCTGCCCCCTCGGGGTCGCGCGCCCACGATGCTGCAGGGCCCTGGCTCGCTGCTGCTGCTCTTC 

<210> SEQ ID NO 373 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 373 
>H14624_node_5
CTCGCCTCGCA 



<210> SEQ ID NO 374 



<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 374 

>H14624_node_6 

CTGCTGCCTGGGCTCGGCGCGCGG 

<210> SEQ ID NO 375 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 375 
>H14624_node_7
GCTCTTC 

<210> SEQ ID NO 376 

<211> Length : 80 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 376 
>H14624_node_8

CTCTTTGGCCAGCCCGACTTCTCCTACAAGCGCAGCAATTGCAAGCCCATCCCGGCCAACCTGCAGCTGTGCCACGG 
CAT 



<210> SEQ ID NO 377 
<211> Length : 55

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 377 
>H14624_node_9

CGAATACCAGAACATGCGGCTGCCCAACCTGCTGGGCCACGAGACCATGAAGGAG 



<210> SEQ ID NO 378 

<211> Length : 213 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 378 
>H38804_PEA_1_node_0

GACTGGGTTGACCGATGCTGGGCAGCTGAGCGGACCAATCGGCCCCCTAGACTGAGACGTTGGCGTTTGAAATCAGC 
CAATGGCAGGTCTACACTGGAGCTTCCTCTCCGCCTCCTTCGCCTAGCCTGCGAGTGTTCTGAGGGAAGCAAGGAGG 
CGGCGGCGGCCGCAGCGAGTGGCGAGTAGTGGAAACGTTGCTTCTGAGGGGAGCCCAAG 



<210> SEQ ID NO 379 

<211> Length : 432

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 379 
>H38804_PEA_1_node_1

GTAGGGAGGCGAGGCGACGGTGTGCGGGAGCGGGCTCTCCAGGGACTTCCCGGGTCCGCAACTGGCAGGGCCGTTCG 
ATTCGCAGGGGATCCCGTTTCGTTTCTGTTGTTTTCCCTTTATTTTTAGGAGTGCCCGGGGCGACGGGACCCCGGGA 
GAGGGGAAAGGGAACAGTCTGGGGTCCGGGCATCGCTGTGGGCCGGGCTGGGTTTAGGGGGACGGCGGTGCGGGCTG
GGCCGGTTTGGGCGCGGCGGGGGCCGGATGATGGGGCGAGTCCGGACCTTGGCGGGCGAGTGCTCGGCGCAGGCGCA 
AGCGCAGAGTCTCCTCGCGGTCGTCCTCTCGGCCCCTCCCTCTGGGGGGACCCCCAGTGCCAGGCTGTCAGTGCGCA 

GCCCCAGCCCGCGGGACCCCTGGGGACTCTGGGCGCCTGTTCTGCAG 

<210> SEQ ID NO 380 

<211> Length : 159 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 380 
>H38804_PEA_1_node_16

GTATATACCCTCTCAGTGTCTGGAGACCGGCTGATTGTGGGAACAGCAGGCCGCAGAGTGTTGGTGTGGGAGTTACG 
GAACATGGGTTACGTGCAGCAGCGCAGGGAGTCCAGCCTGAAATACCAGACTCGCTGCATACGAGCGTTTCCAAACA 

AGCAG 

<210> SEQ ID NO 381 

<211> Length : 139 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 381 
>H38804_PEA_1_node_19

GGTTATGTATTAAGCTCTATTGAAGGCCGAGTGGCAGTTGAGTATTTGGACCCAAGCCCTGAGGTACAGAAGAAGAA 
GTATGCCTTCAAATGTCACAGACTAAAAGAAAATAATATTGAGCAGATTTACCCAGTCAATG 

<210> SEQ ID NO 382 

<211> Length : 196 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 382
>H38804_PEA_1_node_24

ATATTTGGGATCCATTTAACAAAAAGCGACTGTGCCAATTCCATCGGTACCCCACGAGCATCGCATCACTTGCCTXC 
AGTAATGATGGGACTACGCTTGCAATAGCGTCATCATATATGTATGAAATGGATGACACAGAACATCCTGAAGATGG 
TATCTTCATTCGCCAAGTGACAGATGCAGAAACAAAACCCAA 



<210> SEQ ID NO 383 

<211> Length : 353 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 383 
>H38804_PEA_1_node_25

GTGAGTATGCTTCACCTGTATTTGAGCCTTTTCTTGCATTCAACCCAGGATTTATTAATTTTTCTAAATTCATGAAT 
AGCATTGTTGATGCCTGCTCGATATTACAGCTGACTGTAGGGTTGGAGTTGATGTTATCATGTTCTCCCAAGCTTTC 
AATATCCGTAGGTTGATAGACGTCTGATGGATAAAATTGTGCCTAGTTGTTTTGTAGAGAAGAATGTCAAACTCTTA 
TTCTTCTTGAATAGGCTCTATTATTTGAATCTCTGGAGTTATTACCAGCTCATTGCTTCAAAATTAAGTTGAGGAAT 
TCAAGAATAATTTATTTTAGTAAATTCTATTTAAGATGTTTAAGA 



<210> SEQ ID NO 384 

<211> Length : 590 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 384 
>H38804_PEA_1_node_28

TATTAACACAAAGTAAGTGACCTTCAGGTCTTATTGGAAACTCAGAGTAATATGGCCTTGCCTGGAATTGCAAATTT 
CCTTAGTTTTGAAATTTTCATAGATGTCTTTGGTTCTTGGTTGTAACTGTTGACTGAGAAGAGCCATTTACATTTTT 
TGATACCAACAGGGCAAAGCTTTTTACTTAATTACCTCTACCAGGCTTTAAGGGAAATCTGATACTTCAGCATGTGT 
TAAACTATAAAATACCTACTCCAAGTATCTGCCCAGTTCCTTGTCCCCTCTCCCCAGGCCCTTAAAGGAAGTTCTCG 
ATACATATTTGTAGAATAACTGAATGTTTTCAGGATTCCTGTACTTTGCTGAGTTAAAATGGATATGGTACCCTTGC 
TGATTGGTTGAGCCCCTAAGAGGGGGCAGAATATTAAATATTCCATATCAGATATGCTTTTACAGGTTTGACTTTAG 
AAAAGTCTTAGCATGTGAAGCCTGTTGGATAAAGGGCTGTGTTTGCATTTAATCTGTCACTTTTGTATCTCCTGTCC
TGGCTGGCCATTTTGATCTCATGCTGTTCTTTTTTTCTTTTGAACTTGTAG 

<210> SEQ ID NO 385 

<211> Length : 1,228

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 385 
>H38804_PEA_1_node_29

GTCACCATGTACTTGACAAGATTTCATTTACTTAAGTGCCATGTTGATGATAATAAAACAATTCGTACTCCCCAATG 
GTGGATTTATTACTATTAAAGAAACCAGGGAAAATATTAATTTTAATATTATAACAACCTGAAAATAATGGAAAAGA 
GGTTTTTGAATTTTTTTTTTTAAATAAACACCTTCTTAAGTGCATGAGATGGTTTGATGGTTTGCTGCATTAAAGGT 
ATTTGGGCAAACAAAATTGGAGGGCAAGTGACTGCAGTTTTGAGAATCAGTTTTGACCTTGATGATTTTTTGTTTCC 
ACTGTGGAAATAAATGTTTGTAAATAAGTGTAATAAAAATCCCTTTGCATTCTTTCTGGACCTTAAATGGTAGAGGA 
AAAGGCTCGTGAGCCATTTGTTTCTTTTGCTGGTTATAGTTGCTAATTCTAAAGCTGCTTCAGACTGCTTCATGAGG 
AGGTTAATCTACAATTAAACAATATTTCCTCTTGGCCGTCCATTATTTTCTGAAGCAGATGGTTCATCATTTCCTGG 
GCTGTTAAACAAAGCGAGGTTAAGGTTAGACTCTTGGGAATCAGCTAGTTTTCAATCTTATTAGGGTGCAGAAGGAA 
AACTAATAAGAAAACCTCCTAATATCATTTTGTGACTGTAAACAATTATTTATTAGCAAACAATTGATCCCAGAAGG 
GCAAATTGTTTGAGTCAGTAATGAGCTGAGAAAAGACAGAGCATATCTGTGTATTTGGAAAAATAATTGTAACGTAA 
TTGCAGTGCATTTAGACAGGCATCTATTTGGACCTGTTTCTATCTCTAAATGAATTTTTGGAAACATTAATGAGGTT 
TACATATTTCTCTGACATTTATATAGTTCTTATGTCCATTTCAGTTGACCAGCCGCTGGTGATTAAAGTTAAAAAGA 
AAAAAATTATAGTGAGAATGAGATTCATTTCAATGTAATGCACTAAAGCAGAACACGAACTTAGCTTGGCCTATTCT 
AGGTAGTTCCAAATAGTATTTTTGTTGTCAAACTTTAAAATTTATATTAATTTGCAAATGTATGTCTCTGAGTAGGA 
CTTGGACCTTTCCTGAGATTTATTTTATCCGTGATGTATTTTTTTTAATTCTTTTGATACAGAGAAGGGTCTTTTTT 
TTTTTTAAGTATTTCAGTGAAAACTTGGTGTAAGTCTGAACCCATCTTTTGAAATGTATTTTCTTCATTGCAG 

<210> SEQ ID NO 386 

<211> Length : 326 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 386 
>H38804_PEA_1_node_30

GTCCACCTAATCATCCTGTGAAAGTGGTTTCTCTATGGAAAGCTTTGTTTGCTTCCTACAAATACATGCTTATTCCT 
TAAGGGATGTGTTAGAGTTACTGTGGATTTCTCTGTTTTCTGTCTTACAAGAAACTTGTCTATGTACCTTAATACTT 
TGTTTAGGATGAGGAGTCTTTGTGTCCCTGTACAGTAGTCTGACGTATTTCCCCTTCTGTCCCCTAGTAAGCCCAGT 
TGCTGTATCTGAACAGTTTGAGCTCTTTTTGTAATATACTCTAAACCTGTTATTTCTGTGCTAATAAACGAGATGCA 

GAACCCTTGAAAAATGGA 

<210> SEQ ID NO 387 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 387 
>H38804_PEA_1_node_10

GATCCAACGCATGCCTGGAGTGGAGGACTAGATCATCAATTGAAAATGCATGATTTGAACACTGATCAAG 

<210> SEQ ID NO 388 

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 388 
>H38804_PEA_1_node_12

AAAATCTTGTTGGGACCCATGATGCCCCTATCAGATGTG 



<210> SEQ ID NO 389 
<211> Length : 79 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 389 
>H38804_PEA_1_node_13

TTGAATACTGTCCAGAAGTGAATGTGATGGTCACTGGAAGTTGGGATCAGACAGTTAAACTGTGGGATCCCAGAACT 
CC 

<210> SEQ ID NO 390 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 390 
>H38804_PEA_1_node_14

TTGTAATGCTGGGACCTTCTCTCAGCCTGAAAAG 

<210> SEQ ID NO 391 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 391 
>H38804_PEA_1_node_2

ATGACCGGTTCTAACGAGTTCAAGCTGAACCAG 



<210> SEQ ID NO 392 
<211> Length : 39 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 392 
>H38804_PEA_1_node_20

CCATTTCTTTTCACAATATCCACAATACATTTGCCACAG 

<210> SEQ ID NO 393 

<211> Length : 21 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 393 
>H38804_PEA_1_node_23
GTGGTTCTGATGGCTTTGTAA 

<210> SEQ ID NO 394 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 394 
>H38804_PEA_1_node_26

ATTTGAACTGCCAAAAATCTTTCCTCTCCACAGAGGTTGTTTCTTTAA 



<210> SEQ ID NO 395 
<211> Length : 38 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 395 
>H38804_PEA_1_node_3

CCACCCGAGGATGGCATCTCCTCCGTGAAGTTCAGCCC 



<210> SEQ ID NO 396 

<211> Length : 111 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 396 
>H38804_PEA_1_node_4

CAACACCTCCCAGTTCCTGCTTGTCTCCTCCTGGGACACGTCCGTGCGTCTCTACGATGTGCCGGCCAACTCCATGC 
GGCTCAAGTACCAGCACACCGGCGCCGTCCTGGA 



<210> SEQ ID NO 397 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 397 
>H38804_PEA_1_node_5
CTGCGCCTTCTAC 
<210> SEQ ID NO 398

<211> Length : 257 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 398 
>HSENA78_node_0

AGTGGGGAGAGATGAGTGTAGATAAAAGGAGTGCAGAAGGCACGAGGAAGCCACAGTGCTCCGGATCCTCCAATCTT 
CGCTCCTCCAATCTCCGCTCCTCCACCCAGTTCAGGAACCCGCGACCGCTCGCAGCGCTCTCTTGACCACTATGAGC 
CTCCTGTCCAGCCGCGCGGCCCGTGTCCCCGGTCCTTCGAGCTCCTTGTGCGCGCTGTTGGTGCTGCTGCTGCTGCT 

GACGCAGCCAGGGCCCATCGCCAGCG 

<210> SEQ ID NO 399 

<211> Length : 133 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 399 
>HSENA78_node_2

CTGGTCCTGCCGCTGCTGTGTTGAGAGAGCTGCGTTGCGTTTGTTTACAGACCACGCAGGGAGTTCATCCCAAAATG 
ATCAGTAATCTGCAAGTGTTCGCCATAGGCCCACAGTGCTCCAAGGTGGAAGTGGT 

<210> SEQ ID NO 400 

<211> Length : 1,786 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 400
>HSENA78_node_6

TGGAAACAAGGAAAACTGATTAAGAGAAATGAGCACGCATGGAAAAGTTTCCCAGTCTTCAGCAGAGAAGTTTTCTG 

GAGGTCTCTGAACCCAGGGAAGACAAGAAGGAAAGATTTTGTTGTTGTTTGTTTATTTGTTTTTCCAGTAGTTAGCT 

TTCTTCCTGGATTCCTCACTTTGAAGAGTGTGAGGAAAACCTATGTTTGCCGCTTAAGCTTTCAGCTCAGCTAATGA 

AGTGTTTAGCATAGTACCTCTGCTATTTGCTGTTATTTTATCTGCTATGCTATTGAAGTTTTGGCAATTGACTATAG 

TGTGAGCCAGGAATCACTGGCTGTTAATCTTTCAAAGTGTCTTGAATTGTAGGTGACTATTATATTTCCAAGAAATA 

TTCCTTAAGATATTAACTGAGAAGGCTGTGGATTTAATGTGGAAATGATGTTTCATAAGAATTCTGTTGATGGAAAT 

ACACTGTTATCTTCACTTTTATAAGAAATAGGAAATATTTTAATGTTTCTTGGGGAATATGTTAGAGAATTTCCTTA 

CTCTTGATTGTGGGATACTATTTAATTATTTCACTTTAGAAAGCTGAGTGTTTCACACCTTATCTATGTAGAATATA 

TTTCCTTATTCAGAATTTCTAAAAGTTTAAGTTCTATGAGGGCTAATATCTTATCTTCCTATAATTTTAGACATTCT 

TTATCTTTTTAGTATGGCAAACTGCCATCATTTACTTTTAAACTTTGATTTTATATGCTATTTATTAAGTATTTTAT 

TAGGAGTACCATAATTCTGGTAGCTAAATATATATTTTAGATAGATGAAGAAGCTAGAAAACAGGCAAATTCCTGAC 

TGCTAGTTTATATAGAAATGTATTCTTTTAGTTTTTAAAGTAAAGGCAAACTTAACAATGACTTGTACTCTGAAAGT 

TTTGGAAACGTATTCAAACAATTTGAATATAAATTTATCATTTAGTTATAAAAATATATAGCGACATCCTCGAGGCC 

CTAGCATTTCTCCTTGGATAGGGGACCAGAGAGAGCTTGGAATGTTAAAAACAAAACAAAACAAAAAAAAACAAGGA 

GAAGTTGTCCAAGGGATGTCAATTTTTTATCCCTCTGTATGGGTTAGATTTTCCAAAATCATAATTTGAAGAAGGCC 

AGCATTTATGGTAGAATATATAATTATATATAAGGTGGCCACGCTGGGGCAAGTTCCCTCCCCACTCACAGCTTTGG 

CCCCTTTCACAGAGTAGAACCTGGGTTAGAGGATTGCAGAAGACGAGCGGCAGCGGGGAGGGCAGGGAAGATGCCTG 

TCGGGTTTTTAGCACAGTTCATTTCACTGGGATTTTGAAGCATTTCTGTCTGAATGTAAAGCCTGTTCTAGTCCTGG 

TGGGACACACTGGGGTTGGGGGTGGGGGAAGATGCGGTAATGAAACCGGTTAGTCAGTGTTGTCTTAATATCCTTGA 

TAATGCTGTAAAGTTTATTTTTACAAATATTTCTGTTTAAGCTATTTCACCTTTGTTTGGAAATCCTTCCCTTTTAA 

AGAGAAAATGTGACACTTGTGAAAAGGCTTGTAGGAAAGCTCCTCCCTTTTTTTCTTTAAACCTTTAAATGACAAAC 

CTAGGTAATTAATGGTTGTGAATTTCTATTTTTGCTTTGTTTTTAATGAACATTTGTCTTTCAGAATAGGATTCTGT 

GATAATATTTAAATGGCAAAAACAAAACATAATTTTGTGCAATTAACAAAGCTACTGCAAGAAAAATAAAACATTTC 

TTGGTAAAAACGTAT 

<210> SEQ ID NO 401 

<211> Length : 153 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 401 
>HSENA78_node_9

ATATAATATATATTATATATTTAGCATTGCTGAGCTTTTTAGATGCCTATTGTGTATCTTTTAAAGGTTTTGACCAT 
TTTGTTATGAGTAATTACATATATATTACATTCACTATATTAAAATTGTACTTTTTTACTATGTGTCTCATTGGTT 
<210> SEQ ID NO 402

<211> Length : 110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 402 
>HSENA78_node_3

GTAAGTTCTGTGCTGCTGTGTCCGCTGTGACCTTGGCAAGAGAGAAATCCCGCAGCCTGGGTCTTCAACCTTGGTAT 
CTCATGAGTGTATCTTCTTTTTCTTTCCTTCAG 

<210> SEQ ID NO 403 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 403 
>HSENA78_node_4

AGCCTCCCTGAAGAACGGGAAGGAAATTTGTCTTGATCCAGAAGCCCCTTTTCTAAAGAAAGTCATCCAGAAAATTT 
TGGACGG 

<210> SEQ ID NO 404 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 404 
>HSENA78_node_8
GTATTTATATATTATATATTTAT 



<210> SEQ ID NO 405 

<211> Length : 139 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 405 
>HUMODCA_node_1

GTGCGTCTCCATGGCGACCCGCCGGTGCTATAAGTAGGGAGCGGCGTGCCGTGGGGCTTTGTCAGTCCCTCCTGTAG 
CCGCCGCCGCCGCCGCCCGCCGCCCCTCTGCCAGCAGCTCCGGCGCCACCTCGGGCCGGCGT 



<210> SEQ ID NO 406 

<211> Length : 135 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 406 
>HUMODCA_node_25 

GTTGGTTTTGCGGATTGCCACTGATGATTCCAAAGCAGTCTGTCGTCTCAGTGTGAAATTCGGTGCCACGCTCAGAA 
CCAGCAGGCTCCTTTTGGAACGGGCGAAAGAGCTAAATATCGATGTTGTTGGTGTCAG 



<210> SEQ ID NO 407 

<211> Length : 163 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 407
>HUMODCA_node_32

ATCACCGGCGTAATCAACCCAGCGTTGGACAAATACTTTCCGTCAGACTCTGGAGTGAGAATCATAGCTGAGCCCGG 
CAGATACTATGTTGCATCAGCTTTCACGCTTGCAGTTAATATCATTGCCAAGAAAATTGTATTAAAGGAACAGACGG 

GCTCTGATG 

<210> SEQ ID NO 408 

<211> Length : 215 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 408 
>HUMODCA_node_36

AGACCTAAACCAGATGAGAAGTATTATTCATCCAGCATATGGGGACCAACATGTGATGGCCTCGATCGGATTGTTGA 
GCGCTGTGACCTGCCTGAAATGCATGTGGGTGATTGGATGCTCTTTGAAAACATGGGCGCTTACACTGTTGCTGCTG 
CCTCTACGTTCAATGGCTTCCAGAGGCCGACGATCTACTATGTGATGTCAGGGCCTGCGTG 

<210> SEQ ID NO 409 

<211> Length : 173 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 409 
>HUMODCA_node_39

GATGCCAGCACCCTGCCTGTGTCTTGTGCCTGGGAGAGTGGGATGAAACGCCACAGAGCAGCCTGTGCTTCGGCTAG 
TATTAATGTGTAGATAGCACTCTGGTAGCTGTTAACTGCAAGTTTAGCTTGAATTAAGGGATTTGGGGGGACCATGT 

AACTTAATTACTGCTAGTT 
<210> SEQ ID NO 410

<211> Length : 1,096

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 410 
>HUMODCA_node_41

TTGAATATTTGTTTTATATGGATTTTTATTCACTCTTCAGACACGCTACTCAAGAGTGCCCCTCAGCTGCTGAACAA 
GCATTTGTAGCTTGTACAATGGCAGAATGGGCCAAAAGCTTAGTGTTGTGACCTGTTTTTAAAATAAAGTATCTTGA 
AATAATTAGGCATTGGGACGTTTTTATGGTGTGTTCATTCCAGACAGTTCACGAATCCCGTATAGCTCGCTCTGATT 
CTCAGAGAACAATGAGTGGGTCCACCCACACACAGGTAGGAGGACAGGTGAGACGGAAGCCCCATCCTCCCATGTGG 
ACGGTGCACATCTGCTCAGCCCACCCCACATGTCCAGAGTTGGCTGCAAACTCCTTGTCCAGAGCCTCTGGTGGTGG 
GACCTACTTAAGTCTGACGGACCTGTCCTGTCCAGGCCAGTGCCCAGGGAAGGTGTGGGAGGCCCTTTGAGCCTGGC 
CTGCAGAGACCATCCGTGTCCCCTCCCACCTTCATGCCTGTGAGAAGTTAGGAATGTATACGGTACCACATTTGGCA 
GTCAGCTTATTTTAATAAATTCAGCAACAGCAAGTCCCTACCATGTTGTGTATCTTCACCATCTTGTCTGACCATGA 
CCACTGGCCTTGTGTGTXCTTTTACTCAACGTGTACCCCCGCTCTCCCCCAAAGTGTGGCAGGCTCTCATGCTCCTT 
AACCCCCATTGTGGCAATGTCTTACGGGTAACGCTGGAGCTGCAGGAGGAGGGAAGGACACGTCAGAGCCACCAGGC 
AGTGGGAGCATCTTGGAGTCCCCACCAGCCTCATGAGGGGGACAGGAAGAGAGCAAATGTGTAGGGAGGAAGGCTGT 
GGCTCCTCCCGGGGTGGGAAGGTCAAGCCGATGCTGTCACCCATTTACCAAAGCTGAAGAGAGTGACTTCCTTTCTC 
AAAAGCATCACCTTCCCCTGAACCCTGAGTCCAGAGAAGCCAGGAGCCCTCATGTGGCTGCCGAGTTAGCTCAGGGC 
TTGGCTTATCACCAACTCTGGTCTCCCTGGGCCAGGGTTGCCAAAACATGAAAGATTTTTTCAGGAGCCAGAGGTTG 

GTTCTGACTGGAGGGGGA 



<210> SEQ ID NO 411 

<211> Length : 117 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 411 
>HUMODCA_node_0 

GACGTCGGCCCGCCGGCGCCCCACCAGCTCCGCGCGGGCCCGGGTTGGCCACCGCCGGGCCCCCGCCCCTCCCCCGG 
CGGTGTCCCGGCCGGAACCGATCGTGGCTGGTTTGAGCTG 
<210> SEQ ID NO 412

<211> Length : 110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 412 
>HUMODCA_node_10

ATTGTCACTGCTGTTCCAAGGGCACACGCAGAGGGATTTGGAATTCCTGGAGAGTTGCCTTTGTGAGAAGCTGGAAA 
TATTTCTTTCAATTCCATCTCTTAGTTTTCCAT 

<210> SEQ ID NO 413 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 413 
>HUMODCA_node_12

AGGAACATCAAGAAATCATGAACAACTTTGGTAATGAAGAGTTTGACTGCCACTTCCTCGATGAAGGTTTTACTGCC 
AAGGACATTCTGGAC 

<210> SEQ ID NO 414 

<211> Length : 27 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 414

>HUMODCA_node_13

CAGAAAATTAATGAAGTTTCTTCTTCT

<210> SEQ ID NO 415 

<211> Length : 72 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 415 
>HUMODCA_node_2 

CTCCGGCGGGCGGGAGCCAGGCGCTGACGGGCGCGGCGGGGGCGGCCGAGCGCTCCTGCGGCTGCGACTCAG 

<210> SEQ ID NO 416 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 416 
>HUMODCA_node_27

CTTCCATGTAGGAAGCGGCTGTACCGATCCTGAGACCTTCGTGCAGGCAATCTCTGATGCCCGCTGTGTTTTTGACA 
TGGGG 

<210> SEQ ID NO 417 
<211> Length : 56 
<212> Type : DNA 
<213> Organism : Homo sapiens



<400> sequence : 417 
>HUMODCA_node_3 

GCTCCGGCGTCTGCGCTTCCCCATGGGGCTGGCCTGCGGCGCCTGGGCGCTCTGAG 



<210> SEQ ID NO 418 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 418 
>HUMODCA_node_30 

GCTGAGGTTGGTTTCAGCATGTATCTGCTTGATATTGGCGGTGGCTTTCCTGGATCTGAGGATGTGAAACTTAAATT 
TGAAGAG 



<210> SEQ ID NO 419 

<211> Length : 113 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 419 
>HUMODCA_node_34 

ACGAAGATGAGTCGAGTGAGCAGACCTTTATGTATTATGTGAATGATGGCGTCTATGGATCATTTAATTGCATACTC 
TATGACCACGCACATGTAAAGCCCCTTCTGCAAAAG 
<210> SEQ ID NO 420

<211> Length : 55 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 420 
>HUMODCA_node_38

GCAACTCATGCAGCAATTCCAGAACCCCGACTTCCCACCCGAAGTAGAGGAACAG 

<210> SEQ ID NO 421 

<211> Length : 94 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 421 
>HUMODCA_node_40

TTGAAATGTCTTTGTAAGAGTAGGGTCGCCATGATGCAGCCATATGGAAGACTAGGATATGGGTCACACTTATCTGT 
GTTCCTATGGAAACTAT



<210> SEQ ID NO 422 

<211> Length : 271 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 422 
>R00299_node_2

GCGGCCGCAGAGCACTTTGCCCGGAGCCCAGCGTCCTCCCCTGAGTTCGCTGAGTCTCCCGGGACCAGCAAAGGCTG 
CGCGCCCCGCATCGGCCCGGAGGCGGGGAGCCCTGGGAGGCCTGGCCGAGCTGCCCGCAGGGAAATGGCGGAGAAAG 
CGCTTCTCTGCCCGAGTTCAGCCGGGCTGGGGACTTGGCCCTGGGTCCTGAACTCGGCATGGCCAGTTCTGCCTCTG
GCTGTGGACCAGGGTGTGGACTGGAGACCGCGGGGGCCAG 

<210> SEQ ID NO 423 

<211> Length : 172 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 423 
>R00299_node_30

GGGATCGACATTGAGACCAAGATGCACGTCCGCTTCCTTAACATGGAAACCATGGCCCTCTGCCACTGACGCACCGC 

CACCTCCGCGGAGAAACTGCACTTTGCAATGGGGCCGCCTCCCCGCGTAGCTGGAGCAGCCCAGGCCCGGCGGACAG 
CCTCTTCCTGCAGCGCCG 

<210> SEQ ID NO 424 

<211> Length : 77 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 424 
>R00299_node_10

GAGAACTTCAACAATGTCCCGGACCTGGAGCTCAACCCCATCCGATCCAAAATTGTTCGTGCCTTCTTCGACAACAG 

<210> SEQ ID NO 425 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 425 
>R00299_node_14

GAACCTGCGCAAGGGACCCAGTGGCCTGGCTGATGAGATCAATTTCGAGGACTTCCTGACCATCATGTCCTACTTCC 
GGCCCATCGACACCACCATGGACGAGGAACAGGTGGAG 

<210> SEQ ID NO 426 

<211> Length : 25 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 426 
>R00299_node_15
CTGTCCCGGAAGGAGAAGCTGAGAT 

<210> SEQ ID NO 427 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 427 
>R00299_node_20

TTCTGTTCCACATGTACGACTCGGACAGCGACGGCCGCATCACTCTGGAAGAATATCGAAAT 

<210> SEQ ID NO 428 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 428
>R00299_node_23

GTGGTCGAGGAGCTGCTGTCGGGAAACCCTCACATCGAGAAGGAGTCCGCTCGCTCCATCGCCGACGGGGCCATGAT 
GGAGGCGGCCAGCGTGTGCATGGGGCAGATG 

<210> SEQ ID NO 429 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 429 
>R00299_node_25

GAGCCTGATCAGGTGTACGAGGGGATCACCTTCGAGGACTTCCTGAAG 

<210> SEQ ID NO 430 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 430 
>R00299_node_28
ATCTGGCAG 

<210> SEQ ID NO 431 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 431
>R00299_node_31

GTACATAGCCAAGGCTCGTCTGCGCACCTTGTGTCTTGTAGGGTATGGTATGTGGGACTTCGCTGTTTTTATCTCCA 
ATAAAAAAAAAAAAAAGGTTTGTTAATTAAT

<210> SEQ ID NO 432 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 432 
>R00299_node_5

TCTCATCGGATCAGATCGAGCAGCTCCATCGGAGATTTAAGCAGCTGAGTGGAGATCAGCCTACCATTCG 

<210> SEQ ID NO 433 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 433 
>R00299_node_9
CAAG 

<210> SEQ ID NO 434 

<211> Length : 157 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 434
>W60282_PEA_1_node_10

GGCTTGTAGGGGGAGAGACCAGGATCATCAAGGGGTTCGAGTGCAAGCCTCACTCCCAGCCCTGGCAGGCAGCCCTG 
TTCGAGAAGACGCGGCTACTCTGTGGGGCGACGCTCATCGCCCCCAGATGGCTCCTGACAGCAGCCCACTGCCTCAA 
GCC 

<210> SEQ ID NO 435 

<211> Length : 137 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 435 
>W60282_PEA_1_node_18

TACGCCTGCCTCACACCTTGCGATGCGCCAACATCACCATCATTGAGCACCAGAAGTGTGAGAACGCCTACCCCGGC 
AACATCACAGACACCATGGTGTGTGCCAGCGTGCAGGAAGGGGGCAAGGACTCCTGCCAG 

<210> SEQ ID NO 436 

<211> Length : 436 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 436 
>W60282_PEA_1_node_22

TCTCTTCAAGGCATTATCTCCTGGGGCCAGGATCCGTGTGCGATCACCCGAAAGCCTGGTGTCTACACGAAAGTCTG 
CAAATATGTGGACTGGATCCAGGAGACGATGAAGAACAATTAGACTGGACCCACCCACCACAGCCCATCACCCTCCA 
TTTCCACTTGGTGTTTGGTTCCTGTTCACTCTGTTAATAAGAAACCCTAAGCCAAGACCCTCTACGAACATTCTTTG 
GGCCTCCTGGACTACAGGAGATGCTGTCACTTAATAATCAACCTGGGGTTCGAAATCAGTGAGACCTGGATTCAAAT 
TCTGCCTTGAAATATTGTGACTCTGGGAATGACAACACCTGGTTTGTTCTCTGTTGTATCCCCAGCCCCAAAGACAG 
CTCCTGGCCATATATCAAGGTTTCAATAAATATTTGCTAAATGAGTGAATC 
<210> SEQ ID NO 437

<211> Length : 669 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 437 
>W60282_PEA_1_node_5

GGAGCGGCCTAGGGGAGGCCAGGGGCCCACCTGGGCTGGGGCTGTGGAGAGGGAGTGGCTGGGACGGGAGGAAAAAG 
AGAGACGGAGATTAGATGGAAGAAGAGGGATTTCAAGACAAATTGCCAGAGATGCAGTCAGAGAGACTGACTGAGAG 
ACACAAAGATAGAAGGAATTAGAGAAAGGGCCACACAGAGCCAGACAGAGAGAGAAGAGTGGAGATGGAGACAGGGA 
CGAGGACAGAGAAAGGCAGACAGACACATAGGGACAGAAAGAGAAAAATCACACAAAGTCAGAATTACTGAATGACA 
GGGAATGACACATAGAACGAGACACAGATTCAGAGACTCAGGGCAGGGAAAGGAAGGCTGCAGACAGACAGACAGAC 
AGAGGGAGGCTGAGACACAGGGAGAAGAGGGGCTTGGAGAGGTGGCACAGGCAGGCAGCCAGTGCCTCAGAGGCCTC 
CGGGGAGGGCCCTCACACACACCCCGCCCCGGGGCATTAAGGCAGGGCTTGGAGGCCAGTCATCCTGGGCCCGCCCA 
GGGCCGCCCCCCTGCCAGCCCGCCTGCCTGGTGCCTGGCACCTGGCGCTCCAACCCAGCCTACCTGCTGTAGCTGCC 
GCCACTGCCGTCTCCGCCGCCACTGGGCCCCCAGAGCCCCAGCCCCAGAGCCT 

<210> SEQ ID NO 438 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 438 

>W60282_PEA_1_node_21

GGTGACTCCGGGGGCCCTCTGGTCTGTAACCAG 

<210> SEQ ID NO 439 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 439
>W60282_PEA_1_node_8

AGGAACCTGGGGCCCGCTCCTCCCCCCTCCAGGCCATGAGGATTCTGCAGTTAATCCTGCTTGCTCTGGCAACAG 

<210> SEQ ID NO 440 

<211> Length : 616 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 440 
>Z41644_PEA_1_node_0

CCTTGCTGTTCATGGCCCAGCAGGGGCCCTATGGGGGTCAGGCCTGCAGGCACTCACACTTGGCACCTGCTCCAAAA 
CCCTTTCAGGTCTTTGAGGATCTGAGCCCTGGGCCTGGGTCTCCCGCCGGCTCGGAAAAGCTGGCCTGCCGGGCCAG 
ACGAGAGAACCACACGATTCAGAAAAGCAGTGCCCTTCAGCAGCCTCTCCACCGTCTGGGCTCCCCAAAGGCAGAGC 
GGGACGCTGGAAATGTGTGCGCGCTGTGGTATGGGTGTGCAAGTGTGCGAAGGCGGCGTGTTGTGTGAGCGAGAGGG 
TAGCGGATGTGTGTGTGCGTGTGCGCGCGTGGCTCCGGGTGTGCGCCGCTGCGATAGCGGGTCCTTTCCCGGGGCGG 
GCGACGGGCGGGCTGGGAAGGTCTCCTCCCCTCACCACATTGAGAAATCTCAGTGAGTCACCGAGTGGTTCTGCATA 
TTAATGAGCTCGCTCGCTGCGAGGGCAGGAGCGGATTTAAAAGAGGCCAGGGCGGGCGGAGGGAGGCTGTGGAGAGA 
GCGCGGAGACAAGCGCAGAGCGCAGCGCACGGCCACAGACAGCCCTGGGCATCCACCGACGGCGCAGCCGGAGCCAG 

<210> SEQ ID NO 441 

<211> Length : 1,062 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 441 
>Z41644_PEA_1_node_11

GTACGCCCCGCCTCTACTCACCTTCCTTCCCACCAGACCCAGCTGTGGCTCTCAGGATGGGAAGGGACCCCCCCACC 
AGGTCATCTAGCCCCATCTAATATGTGAACACCCACCACAACATCCACAGCAAACAGATACTCAGACAATGCTTACA 
TACCCCCAGGGACAAGGAACTCACCACTTACCAAAGCTAGCTATCCATCTCTTGTCCATTTGCAAGCATGGCAGGTT
TGTCATTTTGTAAACTAAAGTCTGTCTCACTCTAATATTTGCATTATAATCTTAATTCCTCTTTTTATTTCAGTTAC 
GTAAGTTGTTAAATGGCAGAGTGAGCACTGGCATGGCTGCCAGGGGAGCTCTGAGGACTTCAGTGGGGTGAAATGTG 
ACCACTTAGGTGACTGTGTATGTTGGCTATAAAACTGCGCTATAAAACCATGAGGTGCTGAGGATGATCCTTGCCAG 
AAACATGTTTTCTTCTCCAAGGTGCCCCACTCCCTCTGCTGCCCAGAAACCTGATAAACTCCTTCCTTCGCAGGTGC 
TGGAAGGCACCACAGGTTTGGCTCTTTAAAATCAGAGCCACTGTTAACCAAGGCGGGCAGCAGTGTTAAGACCACCA 
GCACCCTGAACCAGCCCTGTACTTACTGGGCACTGTTTCCTTAAAATCAGAAGGTGGCTTCCCATCTCTGGTTTCCT 
GGGGTCTTATGTCTGTCCTCGGAGGGAGAATCCAGTTCCTAGCTCCCCTGTACCATGCGAAGGTAGCCTGTCCTGTC 
TCACTCCTCAGATACGCAGAGTCTGTTTACACATTTGCCTGCATAGCATGATCAGGAAGCACACACACACACACACA 
CACACACACACACGCATGCATGCACACACCATGCAGGTGACTTCCCCAGGAACTAGTGCCAGCACCCCTGCTGCAGA 
GGGGGATATCAAGGCTAAATGGAAGAGAGGGGTGACTTGCCTGGGAGCACAGGGCAAAGCCAGGACAGCAAACCAGG 
CCTCCTGGTGCTACCCCACCAGCTGCCCTCACAGGGTGGAAGGTACAGCCATAGTGGGTGC 



<210> SEQ ID NO 442 

<211> Length : 261 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 442 
>Z41644_PEA_1_node_12

CTGCATTGCCCTCCCCTCACCTGGCCCAGCCATGCTACCCCAAGCTCAGCCCTGTGACCAGCTCTCCCAGAGCTGAC 
ACTCGGGCTCAACCCCTATACCTGAGCCTTTTTTGCTGCCTCCAAAACAGCCTCATCTGCAGTTGCTTGAAATAGAA 
AGTGATGAGAGCAATAAATTATTTTCTATAAATCTGCTGGGAATGAAGCCCTCTTTCTGGTCAAGCCAGGCAGCTCA 
TGTGGCAAAGGCCAGAACTGCGCAGTCCAC 



<210> SEQ ID NO 443 

<211> Length : 1,361 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 443
>Z41644_PEA_1_node_15

GCCCTGTGTCATCAAAACTGGCAGAAGGCTAATCCCATGGGCAGGTTATGGAAGAGGCTGAGGGCATCTTGATCTGA 

TTGCTGGGGGATACTCAAACCTTTAGCTCACCTTGCTTCTCCCCTCCACCTGAGCTGCAGCCTGGAAAAGGAGGCAC 

CCACAGGTCTAAACATGGCCCTGCTTTTTTTTTTCCTGAAAATTCCAATAACAAAAGCAACGAGAGCCTCTCACTAC 

CAGGCCTTCTCTCACTTTGCTATAAAATTAGTTCACCCCTCTTTCTTAGAGTGTTGAGGTCCCTGCCCTCCCCACCT 

CCCTCCCCTGAAACAAGTTGAA71ATATCTTAATGAACATAGAACAGTGATAAAGGAAGTGTTTGAAGTCCTCTTTGT 

ACAGAGAGAGAGAGAAAGAGAATGCCAAAGCTAGGTTGGAGGAAGTAGAAGGGTATACGGTGGGCTCAGGCCCATGG 

GGGCCACACAGAGGAGCTCTGTGCACTTCAGAGACCAGAGCTTCCAGGGAGCTTCTGGCCACCACAGGAAGCAGCCT 

AGTCAGGCATTTTATTTCAATGGATAATTCAGTGGTCTTACTCAGAAATCAAGAACGAGACAGAAAAGTGATAGGCT 

AAGTGTAACGTATGGCCCCAGGGCAGCCATGGGGCAGAACTAGAAGAAAGCAAAATATCTAACTGGGCACAGCTTGA 

GAGGTGAGGGGAAGGTGGGGCTGGGAACGAGTAGAGATGAGGCAATGCAGCCAGGAGCAGGGACTGAGGGGCACAGG 

CCTCCTGCACCACTGCCCCACCCCACCAACCACCTCTTCTGTCTCCAGGAAGCAGCTTCTAGAGCTAGCATTCTTCT 

GGAGGACATGCATTATTTGGGCAAAATACAAAGAAATATACAAGCCTAAGTCAAGTAAGGGAATGCCTCCCACCCTT 

GCTATTTTCTCTAAATAGAGAGGCTGAGTACAGACGCGGAAAGAAACAAGGAGGTGTGGGAGCAGCCCGCCATGCTA 

GAGAAAGACTACATTCCTGCCACTAACAGTCGGTGGCCACTGGGCAAATCTTAAGTCTGTGGTGCCTCAGTTTCCTC 

ATATGCAAAGCGGGTTTGTTCCATAGGCCTCTGAGGACAAAATGAGATTGCAGAAGTGAGATTGCAGATGGTTAGAA 

AAGACAAAGCCACACTGGTGTGAGTTTTCATGGTCCCCGGGACCACATCCTCAGAAGGATCCCTCCCACTTCTCCTG 

GGGGTTCCTGCAGTTCTGGGACAGGGGCATTCCCTGCAGACCAGACGTGAATGAAGCCGCTTAGCCAGCATCTTGTG 

AACGGCCTGCCTCATGTCCTGAGCCACTTACACATGTGTTTTTTCTCCCCAG 



<210> SEQ ID NO 444 

<211> Length : 569 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 444 
>Z41644_PEA_1_node_20

ATGGGAGACCCATCTCTCTTGTGCTCCAGACTTCATCACAGGCTGCTTTTTATCAAAAAGGGGAAAACTCATGCCTT 
TCCTTTTTAAAAAATGCTTTTTTGTATTTGTCCATACGTCACTATACATCTGAGCTTTATAAGCGCCCGGGAGGAAC 
AATGAGCTTGGTGGACACATTTCATTGCAGTGTTGCTCCATTCCTAGCTTGGGAAGCTTCCGCTTAGAGGTCCTGGC 
GCCTCGGCACAGCTGCCACGGGCTCTCCTGGGCTTATGGCCGGTCACAGCCTCAGTGTGACTCCACAGTGGCCCCTG 
TAGCCGGGCAAGCAGGAGCAGGTCTCTCTGCATCTGTTCTCTGAGGAACTCAAGTTTGGTTGCCAGAAAAATGTGCT 
TCATTCCCCCCTGGTTAATTTTTACACACCCTAGGAAACATTTCCAAGATCCTGTGATGGCGAGACAAATGATCCTT 
AAAGAAGGTGTGGGGTCTTTCCCAACCTGAGGATTTCTGAAAGGTTCACAGGTTCAATATTTAATGCTTCAGAAGCA
TGTGAGGTTCCCAACACTGTCAGCAAAAAC 

<210> SEQ ID NO 445 

<211> Length : 163 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 445 
>Z41644_PEA_1_node_24

CAATATATTTGTGATTCCCCATGTAATTCTTCAATGTTAAACAGTGCAGTCCTCTTTCGAAAGCTAAGATGACCATG 
CGCCCTTTCCTCTGTACATATACCCTTAAGAACGCCCCCTCCACACACTGCCCCCCAGTATATGCCGCATTGTACTG 
CTGTGTTAT 

<210> SEQ ID NO 446 

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 446 
>Z41644_PEA_1_node_1

CAGAGCCGGAAGGCGCGCCCCGGGCAGAGAAAGCCGAGCAGAGCTGGGTGGCGTCTCCGGGCCGCCGCTCCGACGGG 
CCAG 

<210> SEQ ID NO 447 

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 447
>Z41644_PEA_1_node_10

CTGCAGAGCACCAAGCGCTTCATCAAGTGGTACAACGCCTGGAACGAGAAGCGCAG 

<210> SEQ ID NO 448 

<211> Length : 17 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 448 
>Z41644_PEA_1_node_13
ACTCTGTCACCCTCCAG 

<210> SEQ ID NO 449 

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 449 
>Z41644_PEA_1_node_16

GGTCTACGAAGAATAGGGTGAAAAACCTCAGAAGGGAAAACTCCAAACCAGTTGGGAGACTTGTGCAAAGGACTTTG 
CAGA 

<210> SEQ ID NO 450 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 450
>Z41644_PEA_1_node_17
TTAAAAAAAAAAAAAAAAAA

<210> SEQ ID NO 451 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 451 
>Z41644_PEA_1_node_19

AAGCCTTTCTTTCTCACAGGCATAAGACACAAATTATATATTGTTATGAAGCACTTTTTACCAACGGTCAGTTTTTA 
CATTTTATAGCTGCGTGCGAAAGGCTTCCAG 

<210> SEQ ID NO 452 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 452 
>Z41644_PEA_1_node_2

CGCCCTCCCCATGTCCCTGCTCCCACGCCGCGCCCCTCCG 

<210> SEQ ID NO 453 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 453
>Z41644_PEA_1_node_21
CTTAGGAGAAAACTTAAAAATAT 

<210> SEQ ID NO 454 

<211> Length : 53 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 454 
>Z41644_PEA_1_node_22

ATGAATACATGCGCAATACACAGCTACAGACACACATTCTGTTGACAAGGGAA 

<210> SEQ ID NO 455 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 455 
>Z41644_PEA_1_node_23

AACCTTCAAAGCATGTTTCTTTCCCTCACCACAACAGAACATGCAGTACTAAAG 

<210> SEQ ID NO 456 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 456
>Z41644_PEA_1_node_25

ATGCTATGTACATGTCAGAAACCATTAGCATTGCATGCAGGTTTCATATTCTTTCTAAGATGGAAAGTAATAAAATA 
TATTTGAAATGTACCAAAATTCTAGA



<210> SEQ ID NO 457 

<211> Length : 36

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 457 
>Z41644_PEA_1_node_3

GTCAGCATGAGGCTCCTGGCGGCCGCGCTGCTCCTG 



<210> SEQ ID NO 458 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 458 
>Z41644_PEA_1_node_4

CTGCTGCTGGCGCTGTACACCGCGCGTGTGGACG 



<210> SEQ ID NO 459 
<211> Length : 106 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 459 
>Z41644_PEA_1_node_6

GGTCCAAATGCAAGTGCTCCCGGAAGGGACCCAAGATCCGCTACAGCGACGTGAAGAAGCTGGAAATGAAGCCAAAG 
TACCCGCACTGCGAGGAGAAGATGGTTAT 

<210> SEQ ID NO 460 

<211> Length : 58 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 460 
>Z41644_PEA_1_node_9

CATCACCACCAAGAGCGTGTCCAGGTACCGAGGTCAGGAGCACTGCCTGCACCCCAAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 461 

<211> Length : 669 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 461 
>Z44808_PEA_1_node_0

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACG 

<210> SEQ ID NO 462 

<211> Length : 187 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 462 
>Z44808_PEA_1_node_16

TGTCATCCTGTGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCT 
GAGTGTGCGCACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGA 
CACGGGGCGCCCCATTCCCGGCACATCCACAAG 



<210> SEQ ID NO 463 

<211> Length : 172 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 463 
>Z44808_PEA_1_node_2

TTTTTGAGAGTGGATCAAGATAAAGACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGC 
ATCTGACGGAAGGACCTTCCTTTCCCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCAT 

ATCGAGGAAACTGCAAAG 



<210> SEQ ID NO 464 
<211> Length : 275 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 464 
>Z44808_PEA_1_node_24

GCTCTCAGAACCCGACCCCAGCCATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACT 
CCAGTGGAGACATCGGCAAAAAGGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGT 
GTGAAGAAGTTTGTTGAATACTGTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGG 
CGTGGCGAAAGAGGACGGCAAAGCGGACACCAAGAAACGCCACA 



<210> SEQ ID NO 465 

<211> Length : 1,685 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 465 
>Z44808_PEA_1_node_32

GATGCAATGGTGGTGTCCTCCAGACCCAAAGCCACAACCCATCGCAAGTCAAGAACACTTTCCAGAAGATAAACATG 

AGTGGGTTCATGTCTCTCTCCTTCAAAGCCAGGACAAAATCCCCACTTCTTTGCTGCCGCGAGTCAATTTGTGATTT 

ATTTTGTCTGCACCTGTTTGATGCCAGGTCGACATTTCCTAAGGCAAGCCCCTGTATTTGTTGTGGATTTAAGTGGA 

GGCGGCCAGCACACACCTTGGATGTAATTTAAAACCATTTCCTGAGGAAAGATGTGTGATATGCTTTCCTTTGTTTA 

GCAAATGTTTATGGTTTTAACTTTAAATCTCACCGCAAATCACTTACACTTGAAAACAGGGCTGGTCTGAAAGTAAT 

TACCCTCCCTGAGTGCCAAGACCTCCAGAAGTTGTTTTCATTCCCGAATGGCAATCACTGTACTCATGCGCTCCACG 

CATCTTAAATAAACTCAGTTCAAAGCACATGCCTCCTGCTTCAGCTCTTTTTTCCAAAAAGAGAAACAGAAGCAGGT 

TCCCCCTCCTTTTATAGTGCCTGCGTGGACACGCGGACCTCCATGCCTTTCATGCTGTGGCTATGTCAGCAAACTAC 

GATATTGGGATGATCCTAACGGGCAAGCCAGCTGCGGCTCCTACCGGCCGTGGCCATTGAAGGCCACCATGTTGCTT 

TGAAACATCTCAAAGAATAACATAGTGCCAGCCAGCAAGGGTTTCACCATATGCATGACCCAGACAGGAACTATCAA 

AAGAAGGGATCACGGGAAGGTGCATGATGCTAATGTGGAATCCAGAGGAGCTCTTTCCTGATCTCTTCAGCTTCCGC 

TGCCACTCCAGAATCATCAGAGCTGATATTAAATAAGTTAAAATGTTAGTCCACCGTCTCCTCCTGCAATCCTAACC 

ATCTTTTGAGACTGTTAGAATACTTTGACGGGTTGTCTTTCTGTGCAACTAATTTAAACCTCAAGTTTAGTGTAGGA 

GATGGGTTTGTCTTCTCACCTCTTCAGATCTTTATCAAGGGGGAATAAAAGCCAACCCAGAAACCTAAACTTTAAAA 

TTTAATTATTTGAAATAATAAAACAGAAGAAGGGATCAACATTTGTCGGAATTGGCACTCTTGGAAAACTAAGTCTA 

GGAGATCATATATTGCTTTTTTTTTTTCATTCTAAATTACTTTTAATTGAAAGTCAAGATGCTGAGTTACAGTTGTT 
TATCATTATAATAAGCAAACTTTTTAAGTTGGATTTCTTCTTAAAGAGGTAAACTAGTGAACAAAAAAGATAAAAAG
GAAAATTAAGAATCAACTATGCCTTTATCAAATTTGAAGCATAAGTTATATTATTAAAATTATTTTTGTATAATCAA 
GGTGATAAGACATTCTGGAAAACATTTAATGTATTTAGTACTTAGAATATTTACAGTGGATGTTACTTTTTTGAAAC 
GATATATTTTTCCCAATTTTTCTATCATGTCAAGGAAGGAAACTGTTAAGAAGTTACCAGTGTCCAAAATGTCTTCA 
TTGTTTCTTACTCATACTTACACCTCACATGACCTGCCCAGCCCTCTTTGGTTCAGTTCATTCCCAGAAGCCAAGCC 
TTAGTCTTCACAGATGAGCGACACACACCTCTGAATATAATGTCTCTTTTTTGTTTTTTCCTTTTCAG 



<210> SEQ ID NO 466 

<211> Length : 877 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 466 
>Z44808_PEA_1_node_33

CCAAGGAAACAAGGATAAATGGCTCATACCCCGAAGGCAGTTCCTAGACACATGGGAAATTTCCCTCACCAAAGAGC 
AATTAAGAAAACAAAAACAGAAACACATAGTATTTGCACTTTGTACTTTAAATGTAAATTCACTTTGTAGAAATGAG 
CTATTTAAACAGACTGTTTTAATCTGTGAAAATGGAGAGCTGGCTTCAGAAAATTAATCACATACAATGTATGTGTC 
CTCTTTTGACCTTGGAAATCTGTATGTGGTGGAGAAGTATTTGAATGCATTTAGGCTTAATTTCTTCGCCTTCCATA 
TGTTAACAGTAGAGCTCTATGCACTCCGGCTGCAATCGTATGGCTTTCTCTAACCCCTGCAGTCACTTCCAGATGCC 
TGTGCTTACAGCATTGTGGAATCATGTTGGAAGCTCCACATGTCCATGGAAGTTTGTGATGTACGGCCGACCCTACA 
GGCAGTTAACATGCATGGGCTGGTTTGTTTCTTGGGATTTTCTGTTAGTTTGTCTTGTTTTGCTTTCCAGAGATCTT 
GCTCATACAATGAATCACGCAACCACTAAAGCTATCCAGTTAAGTGCAGGTAGTTCCCCTGGAGGAAATAATATTTT 
CAAACTGTCGTTGGTGTGATACTTTGGCTCAAAGGATCTTTGCTTTTCCATTTTAAGCTTCTGTTTTGAGTTTTGCC 
CTGGGGCTTGAATGAGTCCCAGAGAGTCGTTCGGATGGTGGGAGGCTGCCTAGGAGGCAGTAAATCCAGTCACAGTG 
CCTGGGAGGGGCCCATCCTTCCAAAATGTAAATCCAGTCGCGGTGTGACCGAGCTGGCTAACAGGCTTGTCTGCCTG 

GTTTTCCTCCTACACGTGGACATTATTCTC 



<210> SEQ ID NO 467 
<211> Length : 252 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 467 
>Z44808_PEA_1_node_36

GATACGTGGTGCCCCCGGGGCTGGTGTTGGCAGCCGGGGGGAGGTGCCTGAGGGTCCCCACGGTTCCTTTCTGCTTT 
TCTGAATGCATCAAGGGTACGAGAACTTGCCAATGGGAAATTCATCCGAGTGGCACTGGCAGAGAAGGATAGGAGTG 
GAATGCCCACACAGTGACCAACAGAACTGGTCTGCGTGCATAACCAGCTGCCACCCTCAGGCCTGGGCCCCAGAGCT 
CAGGGCACCCAGTGTCTTAAG 

<210> SEQ ID NO 468 

<211> Length : 349 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 468 
>Z44808_PEA_1_node_37

GAACCATTTGGAGGACAGTCTGAGAGCAGGAACTTCAAGCTGTGATTCTATCTCGGCTCAGACTTTTGGTTGGAAAA 
AGATCTTCATGGCCCCAAATCCCCTGAGACATGCCTTGTAGAATGATTTTGTGATGTTGTGATGCTTGTGGAGCATC 
GCGTAAGGCTTCTTGCTTATTTAAACTGTGCAAGGTAAAAATCAAGCCTTTGGAGCCACAGAACCAGCTCAAGTACA 
TGCCAATGTTGTTTAAGAAACAGTTATGATCCTAAACTTTTTGGATAATCTTTTATATTTCTGACCTTTGAATTTAA 
TCATTGTTCTTAGATTAAAATAAAATATGCTATTGAAACTA 

<210> SEQ ID NO 469 

<211> Length : 0 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 469 
>Z44808_PEA_1_node_41

CACCTCATAATCATGTGAAAAAGACACTCAAAAACTACCATTTGAATGGATGGATGAAAATAACCTCCGTATATTCT 
ACGAAGATGTTTAATAATAAATAGGTTTCGTTATAAGAGAATGTGTGTCACTTCGTCTCTTCCCTCACCCCCGAGAC 
TTAGTGACAGTTATTTTTGACTTTTCCAACTATACTATTTGCCTAGAAAATGTGTCTATTAAATAGCGTATTGAGA^
AT 

<210> SEQ ID NO 470 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 470 
>Z44808_PEA_1_node_11

ATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAG 

<210> SEQ ID NO 471 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 471 
>Z44808_PEA_1_node_13

ATATTGCATCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAG 

<210> SEQ ID NO 472 

<211> Length : 83 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 472 
>Z44808_PEA_1_node_18

GTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCGGGACCTGTACAAGGGCCGCCAGC 
TACAAG 

<210> SEQ ID NO 473 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 473 

>Z44808_PEA_1_node_22

GTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACGCGCTGTCCACGGACATGGTCCACGCCGCC 
TCCGACCCCTCCTCCTCGTCAGGCAG 

<210> SEQ ID NO 474 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 474 

>Z44808_PEA_1_node_26

GAAGTAAGAGAAACCTGTGATGGCCAGAGCCCAGATGTTCTTAGGAGGCAAGCCAGGAGAAGCCGGGTCTGACTTTT 
CAGCTCAGAGACAGCACT 

<210> SEQ ID NO 475 

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 475

>Z44808_PEA_1_node_30

CCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACAG

<210> SEQ ID NO 476 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 476 

>Z44808_PEA_1_node_34

CTGATCCTCCTACCTGGTCCACCCCAGGGCTACCGGAAGGTAAAATCTTCACCTGAACCAATTATGAGCAGTCTC 

<210> SEQ ID NO 477 

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 477 
>Z44808_PEA_1_node_35
CTTACTGAAGGTACAGCCG 

<210> SEQ ID NO 478 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 478
>Z44808_PEA_1_node_39

CTACTGTGGTTGAGAGGAAAGGTGTCTTTTTATTGCTTCTAGAGACGTTGAAAGTGTGACCTGAG 



<210> SEQ ID NO 479 

<211> Length : 107 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 479 
>Z44808_PEA_1_node_4

ACGTGTCCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCT 
GAGTGCAATGACGACGGCACCTACAGTCAG 

<210> SEQ ID NO 480 

<211> Length : 100 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 480 
>Z44808_PEA_1_node_6

GTCCAGTGTCACAGCTAGACGGGATACTGCTGGTGCGTCACGCCCAACGGGAGGCCCATCAGCGGCACTGCCGTGGC 
CCACAAGACGCCCCGGTGCCCGG 



<210> SEQ ID NO 481 
<211> Length : 48 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 481 
>Z44808_PEA_1_node_8

GTTCCGTAAATGAAAAGTTACCCCAACGCGAAGGCACAGGAAAAACAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 482 

<211> Length : 170 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 482 
>AA161187_node_0

GCTGGGAGTAGAGGGCAGAGCTCCCACCCCGCCCCGCCCCCAGGGGGCGCCCCGGGCCCGGCGCGTTAGGAGGCAGA 
GGGGGCGTCAGGCCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGGGCT 

GGACTCAGGAAGCCGG



<210> SEQ ID NO 483 

<211> Length : 120 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 483 
>AA161187_node_6

GCACACACGCGAGGGGACCCTGGGTGGGCAAAAACGTGCTTTCCCGGACGGGGTTGAAGGGGAGAAAGGGAGAGGTC 
GGGCTTGGGGGGCTGCCTCCCGCGGCTCAGCAGTTCCTCTGAC 
<210> SEQ ID NO 484

<211> Length : 211 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 484 
>AA161187_node_14

GCCTACTACACCCGTTACTTCGTATCGAATATCTATCTGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGC 
CTTGGTGAAGCTGTCTGCACCTGTCACCTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGT 
TTGAGAACCGGACAGACTGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGG 



<210> SEQ ID NO 485 

<211> Length : 297 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 485 
>AA161187_node_16

TGCTCACCAATGCCCCAGGCATCAGGCTCCTGGGCTGCCTCTCCATGCCTCCCACACCCACCCTAGCTCTGGCCGAT 
TCTCCTGCAGCACTGAGCCCATTCCTCTCCCCAGAAACTTCCAAGCCATGCTCAACCGCAGCTCCCACGGAAACCCC 
TCTGGGGGTTCCTCTGGTGGGCCTGCCCTGGCACCTGCGTGTCCCCCAACACACATGCCCTGAAAGAAGTGGGCCCA 
GCATCCGGAGGAGCCCCGGCAGCCCCAGACTGGGCGTGTTCCCTGTATCAGGAATCCCTTCCCTCT 

<210> SEQ ID NO 486 

<211> Length : 225 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 486
>AA161187_node_25

GAAAGCATCCTGTGTCCCTGTGCCTTATTTGACCCTCATGCCAACCCCGGGAGGTGGAGACTGTTGCCCCACTCTGC 
AGATGCAGAAACGGAGGCTTGGCTGCTGCCAGGGGGAGGAGGAGGATGTGCACCCAGTCTACCCAGCCCCATAGCCC 
TTCCCACTCTCAGCCCCTCCCCTGCCCCACTCACTCTGCCCCAGGCTGACCTCAGCCCCGCTGCTCCCCAG 

<210> SEQ ID NO 487 

<211> Length : 362 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 487 
>AA161187_node_26

GGTGACTCAGGTGGACCCTTGGCCTGTAACAAGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAGTGGG 
CTGTGGTCGGCCCAATCGGCCCGGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCC 
AGAGTGGCATGTCCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGG 
CCGGTCTGAGCCTACCTGAGCCCATGCAGCCTGGGGCCACTGCCAAGTCAGGCCCTGGTTCTCTTCTGTCTTGTTTG 
GTAATAAACACATTCCAGTTGATGCCTTGCAGGGCATTCTTCAAAAGCAATGGC 

<210> SEQ ID NO 488 

<211> Length : 515 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 488 
>AA161187_node_28

ATAAGAGGACACAGTGAGAAGATGGGGGTCTGCTCGCCAGGAAGGAGCCCTCACCAGCAGCCGCATCGCTCAGCACC 
TTGATCCTGGACTTCCAGCCTCCAGAGCTGTGAGAAACAAACCTCTATCATCTACCAGCCGCCCACGGCGTGGGATT 
TGTGTTACAGCAGCCTGAGCTGACCCAGACGCCAAGGAGCAACACACGCACCAGGGTAGGCTGGAGAAACCAGAACC 
CGGGAATCCCGCCTCCCTCAACTTGAAACTTGGGAATAGTGTATTCTCTTTTCAACACTTGCACTAGTAGAAGGTTA 
ATTACATGAAAGATTAGGCAAAATGTATGGCTATGTGTCCTGGTTTTCCAATAAAAGTATTGAGTTTCTCTGGGGAA 
AGTGCAGATAAAATGCTTAGTGGAGGCTGGGCGCTGTGGCTTATGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGC
AGGCAGGCAGATCACAAGGTCAGGAGTTTGAGACCGGCCTGGCCAATATGATG 

<210> SEQ ID NO 489 

<211> Length : 27 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 489 

>AA161187_node_4

AGTCGCAGGAGGCGGCGCCCTTATCAG 

<210> SEQ ID NO 490 

<211> Length : 8 

<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 490 
>AA161187_node_7
CATCCGAG 

<210> SEQ ID NO 491 

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 491 
>AA161187_node_8

GACCATGCGGCCGACGGGTCATCACGTCGCGCATCGTGGGTGGAGAGGACGCCGAACTC 



<210> SEQ ID NO 492 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 492 
>AA161187_node_9

GGGCGTTGGCCGTGGCAGGGGAGCCTGCGCCTGTGGGATTCC 



<210> SEQ ID NO 493 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 493 
>AA161187_node_10

CACGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCACTGCTTTGAAAC 



<210> SEQ ID NO 494 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 494
>AA161187_node_12
CTATAG 

<210> SEQ ID NO 495 

<211> Length : 76 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 495 
>AA161187_node_13

TGACCTTAGTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTTCCATGCCATCCTTCTGGAGCCTGCAG 

<210> SEQ ID NO 496 

<211> Length : 37 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 496 
>AA161187_node_19

GTTGCTGTCTCTCTCCTTCCCACTATCGTCCGCACAG 

<210> SEQ ID NO 497 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 497
>AA161187_node_20
CACTGCCATCTCCCCACACCCTCCAG 

<210> SEQ ID NO 498 

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 498 
>AA161187_node_21

GAAGTTCAGGTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAG 

<210> SEQ ID NO 499 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 499 

>AA161187_node_22

TTTCCGCAAGGACATCTTTGGAGACATG 

<210> SEQ ID NO 500 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 500
>AA161187_node_23

GTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTC 

<210> SEQ ID NO 501 

<211> Length : 31 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 501 
>AA161187_node_24

GTGAGTGTCCCTGCCACCACTCCCAGCCCAG 



Segment nucleic acid sequences:

<210> SEQ ID NO 502 

<211> Length : 712 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 502 
>R66178_node_0

GCAGCGAGCGCGGCTCCACATTGTTGCGGATCGCCGGCACCCGGCAGAGCGGCGGCGGCTGGGACGCGCGGCGCCTC 
CGACCCGTTCTCCTCGCGCCCGGCCGCGCAGCCAGAGCCACCCGGGCCGCCGACCGCCGAGCCCCGCGCCCGCCGCC 
TGGGCCCCGAGCCTTCTGCGCCGCCCGGGTGCGTCCCGCCACCCTCGGAGGACGGCCGGCCATGGACGCCTGCAAGT 
TGGAGCCGAGCGGGAGGGTGTGAGCGGGCCGGGGCCAGGAGCCCGCGCCGCGCAACCGGGCAGCCGGGCGCGCCGGG 
GGTGGGTCCCTCTCCCCAGCCCCGCTCTGCGTGGAAGAAGAGGGCGGGGACCGGCGCCGGGAGGAGAGCGGAGGAGG 
CGAAGGGGCATGACTCGTGCAACTTGCGGCGGGCATCTGCCGAGCCTCTGAGCCGGCGGCGGCCCGGGGCCCGGACT 
GCGGCCGCGCGGATCCACCCAGCCCACCCCGCCCCGGCCGACGGCTGCAGCTGACCTGGATCCTTCGAGCGCCCGCC 
GACCGCCAGCGATCTTCCCTCATCTTCCGGGCTGGTTTCTGCTGCGCGAGGAGCGCTGCCCTCGCCGCCCCTCTCGC
CGGACCCCCGGCCCCCGATGGCTCGGATGGGGCTTGCGGGCGCCGCTGGACGCTGGTGGGGACTCGCTCTCGGCTTG 
ACCGCATTCTTCCTCCCAG 

<210> SEQ ID NO 503 

<211> Length : 302 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 503 
>R66178_node_6

CGGCACAGACGTGGTTCTGCACTGCAGCTTTGCCAACCCGCTTCCCAGCGTGAAGATCACCCAGGTCACATGGCAGA 
AGTCCACCAATGGCTCCAAGCAGAACGTGGCCATCTACAACCCATCCATGGGCGTGTCCGTGCTGGCTCCCTACCGC 
GAGCGTGTGGAATTCCTGCGGCCCTCCTTCACCGATGGCACTATCCGCCTCTCCCGCCTGGAGCTGGAGGATGAGGG 
TGTCTACATCTGCGAGTTTGCTACCTTCCCTACGGGCAATCGAGAAAGCCAGCTCAATCTCACGGTGATGG 

<210> SEQ ID NO 504 

<211> Length : 206 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 504 
>R66178_node_8

CCAAACCCACCAATTGGATAGAGGGTACCCAGGCAGTGCTTCGAGCCAAGAAGGGGCAGGATGACAAGGTCCTGGTG 
GCCACCTGCACCTCAGCCAATGGGAAGCCTCCCAGTGTGGTATCCTGGGAAACTCGGTTAAAAGGTGAGGCAGAGTA 
CCAGGAGATCCGGAACCCCAATGGCACAGTGACGGTCATCAGCCGCTACCGC 



<210> SEQ ID NO 505 
<211> Length : 139 




<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 505 
>R66178_node_15

GCTAAATGGCTCTCTCCCCAAGGGTGTGGAGGCCCAGAACAGAACCCTCTTCTTCAAGGGACCCATCAACTACAGCC 
TGGCAGGGACCTACATCTGTGAGGCCACCAACCCCATCGGTACACGCTCAGGCCAGGTGGAG 



<210> SEQ ID NO 506 

<211> Length : 1,474

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 506 
>R66178_node_24

GTGAGGGTCACAGTCTGCCCATCAGTCCTGGAGTTCTTCAGACCCAGAATTGCGGGCCTTGAAGGCCAGTGTCCTGG 
AAGGGAGGCAGGTGTGAGCCTGCAAGTGTGCATGCCCAGCCATGGTATGGACATGTGTCTCTGGGCATGTAAATGTG 
AACCAGTGTGAACAGGCTCCGTCTGCATGCTGAGTGTGCATGTGGGAGCCCGTGGCTGTGCCGTGGCAACGTGCCAT 
TCTCTGAGCCAGCGAATGGCAGTGTGTTGGGAGGTCTGAGAAGGCAGCTGCATCCGTGCCTCTGGGAGGATTCGGTT 
CTCCCCCAGCTTGCCGAGGCCCTGCCTGATGGTCTGACACGAGGCACAGCTGCTGCAGCTGCAGATGGACAGAAGGG 
CTTCCCAGAGGTGGACCCCAGGCCTCCCCACTCTCCCTGTGGCTGGCTGCACTGCATGCTGGGGGGTGTAGTTCTTG 
CAGCTTCCAGGCCTAATCTGATGCCGGAGCATTTCCTGCCTGAGGAAGCGCCAGGCATTGGTTTCGGAGGCAAACCC 
AAACATTCTCTTTGACCCCAAACCTCCAGATCCTAGATCCAGACTGTAAGCCCTAACACTTCACTCCACCTCAGATC 
TATCCAAAGCCCCCAGCACCAGCCCACCCACCTCAGTCAGAGAACCAGGACCCCAAAGGCATGCAGAGCCCCCACTT 
CCCCACTGTCTTGGCCAGCCAGGGACCCCAGAAGAGAGGTTACAACCCTTCAGGAATAGGGACAAGCTGCTCCCTTT 
GTAAGAGGATGTGAGGGAGGCTGGCTGGGCCCCTGCCAGCAAACACAAATGACCCTGCGGCCTGGCTCTTCTCTCTC 
CTCCCAGCTGCGGCCCTTGCAGCTCTGCTCCTGGCACAGAGACAGGAGCTACTGGCTGAGTGTAACAGCTGGAGGGA 
TGGAGGGGGAGGGGAGGACGCTCCACTCCACGCCAGACAGCCCCTTCTGCTTGCAAATGAGTTAGATCCCCATGCTT 
CTCCTTTCTTCTCTCCCTCACTCAGTATCCTCACTCGAAAGTCTTTGATGCTGGAAGGTCATCCCAGCCATTCTGCT 
GCTGCTACACAGGCCCAGCCCTAAACAAAATAACCGGGGTTCTTTGGTCCCAAAAGATCCCCAGGAAAGAGTAAACC 
TCCTTAAGACTTAAGGAAAAAAGTTGGCTAGGCGTGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGG 
TGGGCAGACCACTTGAGATCAGGAGTTCGAGACCAGCCTGGCCAACGTTGTGAAACCCCGTCTCTACAAAATATTAG 
CCGGGTGTGGCAGTGCGTGCCTGTAGTCCCAACTACTCAGGAGGCTGAGGCACAAGAATTGCTTGAGCCTGGGAGGC 
GGAGGTTGCAGTGAGCCGAGATCGTGCCACTGCACTCCAGCCCAGGCGATGGAGCGAGACTCTGTCCCGCAAAAAAA
AAAAAAAAAAA 



<210> SEQ ID NO 507 

<211> Length : 464 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 507 
>R66178_node_26

AATTCCCCTACACCCCGTCTCCTCCCGAACATGGGCGGCGCGCCGGGCCGGTGCCCACGGCCATCATTGGGGGCGTG 
GCGGGGAGCATCCTGCTGGTGTTGATTGTGGTCGGCGGGATCGTGGTCGCCCTGCGTCGGCGCCGGCACACCTTCAA 
GGGTGACTACAGCACCAAGAAGCACGTGTATGGCAACGGCTACAGCAAGGCAGGCATCCCCCAGCACCACCCACCAA 
TGGCACAGAACCTGCAGTACCCCGACGACTCAGACGACGAGAAGAAGGCCGGCCCACTGGGTGGAAGCAGCTATGAG 
GAGGAGGAGGAGGAGGAGGAGGGCGGTGGAGGGGGCGAGCGCAAGGTGGGCGGCCCCCACCCCAAATATGACGAGGA 
CGCCAAGCGGCCCTACTTCACCGTGGATGAGGCCGAGGCCCGTCAGGACGGCTACGGGGACCGGACTCTGGGCTACC 

AG 



<210> SEQ ID NO 508 

<211> Length : 277 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 508 
>R66178_node_27

TACGACCCTGAGCAGCTGGACTTGGCTGAGAACATGGTTTCTCAGAACGACGGGTCTTTCATTTCCAAGAAGGAGTG 
GTACGTGTAGCCCCCCTTCCAGAGCCTCTGTCTGTGACCGCTCCTAACCAGCCCCTCCCCGCACGCCCCCTGCCCAC 
CCCCCACCTCCCACTCCAGGAGCTGAACAGAGACTTGCCCAGCTGCCCAAAGCCAGCCCCGAACTCCTGGGGGGCCA 
GGGGAGCCCAGGGCAGCCACGACTTGGCTTTGTGTTTTATTTCCTC 
<210> SEQ ID NO 509

<211> Length : 37 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 509 
>R66178_node_4 

GCGTCCACTCCCAGGTGGTCCAGGTGAACGACTCCAT 

<210> SEQ ID NO 510 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 510 
>R66178_node_5
GTATGGCTTCAT 

<210> SEQ ID NO 511 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 511 
>R66178_node_9

CTGGTGCCCAGCAGGGAAGCCCACCAGCAGTCCTTGGCCTGCATCGTCAACTACCACATGGACCGCTTCAAGGAAAG
CCTCACTCTCAACGTGCAGT 



<210> SEQ ID NO 512

<211> Length : 118 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 512 
>R66178_node_11

ATGAGCCTGAGGTAACCATTGAGGGGTTTGATGGCAACTGGTACCTGCAGCGGATGGACGTGAAGCTCACCTGCAAA 
GCTGATGCTAACCCCCCAGCCACTGAGTACCACTGGACCAC 



<210> SEQ ID NO 513 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 513 
>R66178_node_16
GTCAATATCACAG



<210> SEQ ID NO 514 
<211> Length : 107 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 514 
>R66178_node_18

CTTTCTGTCAACTTATCTATCCGGGCAAAGGGAGGACAAGAGCTAGGATGTTCTGAGGAGAGACTTCACCTGGGACG 
TGAAAGGAGCATGGGCTTGATGTCAGACAG 

<210> SEQ ID NO 515 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 515 

>R66178_node_19 

CTGTGACCCTGGACAGGGCC 

<210> SEQ ID NO 516 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 516 
>R66178_node_20

CCCCCCACCATCTGTAAAACGGGGACAG 

<210> SEQ ID NO 517 
<211> Length : 112 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 517 
>R66178_node_21

TATGATGTACCTTGAAGGGCTGTTGTCAGAATTCTACGTGATGTAAGTCAAGCACCTAGCACAGATCAGTCTGTCAA 
TAAATGGCCAATGTTCCGTGATTATTACCCTGACC 



Segment nucleic acid sequences: 

<210> SEQ ID NO 518 
<211> Length : 264 
<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 518 
>HUMPHOSLIP_PEA_2_node_0 

GGGTCTCCCACTTGTCCAGACAGCGGCCGGGCTTGTCACGGGGCTCTGTGCAGCCTTTTCCACTCTCCCGGCTGCCA 
GCGTCCCGCCCCGTCCCCTCCCAGCCCCCAAGGGAGGAGGGGAGAGCTGCAGAGAGGAGGAGGGGTCGGGGAGGCCC 
GGCTTTATAAAGGCGGCTGGAACAACCCTGCCCGCCAGACCCCGTCGCCCGGATCCCCTGAGCTGCCCGCCATCCCA 

CGTGACCGCGCCGCCCCCCAGCTCCACCGCTGA 

<210> SEQ ID NO 519 

<211> Length : 156 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 519 
>HUMPHOSLIP_PEA_2_node_19

CTATGATGGGGGCTACATCAACGCCTCAGCTGAGGGTGTGTCCATCCGCACTGGTCTGGAGCTCTCCCGGGATCCCG
CTGGACGGATGAAAGTGTCCAATGTCTCCTGCCAGGCCTCTGTCTCCAGAATGCACGCGGCCTTCGGGGGAACCTTC 

AA 



<210> SEQ ID NO 520 

<211> Length : 141 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 520 
>HUMPHOSLIP_PEA_2_node_34

CTCCCCAACCGGGCAGTGGAGCCCCAGCTGCAGGAGGAAGAGCGGATGGTGTATGTGGCCTTCTCTGAGTTCTTCTT 
CGACTCTGCCATGGAGAGCTACTTCCGGGCGGGGGCCCTGCAGCTGTTGCTGGTGGGGGACAAG 



<210> SEQ ID NO 521 

<211> Length : 419 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 521 
>HUMPHOSLIP_PEA_2_node_68

CTATCAATTACAGTCCGTCCACCACCTCCCTGTGGGCTGTCCTGAGCTCTGTTGGGTTCCTGGGATGGAATCAGTGC 
ATCATAAAGGGCATTCTTTAAGCAGAGAAGGGGCCAGGCCACCCCATTCAGGAACTGCTGCGGGAATAAAGTGCTAA 
CTTGCCCCCAGGCTGTCTATGGGAGACCCTGGGCCCAGTCTGGGATGTACAGGGCTCTGGGAAGGGGGCAGTCCTGG 
CGGCAGAACCCGGCCTGCAGGGGCACTTTGCTTAGAAGAGGACTCTCCTAGCGGGAGAGGCTGGGAGGGGCTGCATC 
AGGCCGTGGAGCTGGTTGCTGTGGTCATCAGTATGGCTGCTTGTTCAGGAAGCGGGAGAACATGGTGAAGGCAGCGA 
GGGGCTTGTCGGTGGGAACCATGTGGCCGGCGCC 




<210> SEQ ID NO 522 

<211> Length : 232 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 522 
>HUMPHOSLIP_PEA_2_node_70

GTTCGTGAGCTCCTGACCCCACCATTCCCTCCTCCCCATATAACTGCTCACTCGGGGGCAATTCCTTCATCCCAAAC 
CCTTTATTCTTCCCAGAACCCTCCCCACCCCTCTCCAAAAAAACTTGCCCATACAGGGGCCAGATGGTGACCCATGA 
CCCAGCCTAAAAGGCAGCCAGAGGGAAAGGACGGGTGGGTCCTGCTCCTTTGCCTCCGGCCCAGTTATCTCTCAGCA 

G 

<210> SEQ ID NO 523 

<211> Length : 280 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 523 
>HUMPHOSLIP_PEA_2_node_75

CTTCTGGTTGAGGGAATCCACAAACCACTCATCCCCCATGAAATTGCAGGCCATGTCTACATCTCCATTATATAATA 
GGATCTGGTATTTCTAAAGCAGGATGGGGTAAAAATGAGGGGTGTGGAACAAGCCCAGTCCCCAGCCCTTCCCTAGT 
TCAAGGCCTACCCCTCAGGAAATTCAAGGGGCCAAGCTAGATAACACGAACCAGGGAATTTTCATGTTTTCTAACGA 
CTTACTGCATGTCCAGTATTCTACTAAATGTTTTATCTGTGAAAGTAGA 

<210> SEQ ID NO 524 

<211> Length : 73 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 524
>HUMPHOSLIP_PEA_2_node_2

GCCCGCTCGCCATGGCCCTCTTCGGGGCCCTCTTCCTAGCGCTGCTGGCAGGCGCACATGCAGAGTTCCCAGG 

<210> SEQ ID NO 525 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 525 

>HUMPHOSLIP_PEA_2_node_3 

CTGCAAGATCCGCGTCAC 

<210> SEQ ID NO 526 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 526 
>HUMPHOSLIP_PEA_2_node_4
CTCCAAGGCGCTGGAGCTGG 

<210> SEQ ID NO 527 

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 527
>HUMPHOSLIP_PEA_2_node_6
TGAAGCAG 

<210> SEQ ID NO 528 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 528 
>HUMPHOSLIP_PEA_2_node_7
GAGGGG 

<210> SEQ ID NO 529 

<211> Length : 35 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 529 

>HUMPHOSLIP_PEA_2_node_8

CTGCGCTTTCTGGAGCAAGAGCTGGAGACTATCAC 

<210> SEQ ID NO 530 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 530
>HUMPHOSLIP_PEA_2_node_9

CATTCCGGACCTGCGGGGCAAAGAAGGCCACTTCTACTACAACATCTCTGA 

<210> SEQ ID NO 531 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 531 
>HUMPHOSLIP_PEA_2_node_14 

GCCTGGACTTGAAAGGGGAGCAGACAAATTTCCTGTCGTTGGGGGAAGTTCCCTCTTCTTGGCCCTGGATCTGACCC 
TGAGGCCTCCTGTAG 

<210> SEQ ID NO 532 

<211> Length : 16 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 532 
>HUMPHOSLIP_PEA_2_node_15
GGTGAAGGTCACAGAG 

<210> SEQ ID NO 533 
<211> Length : 89 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 533 
>HUMPHOSLIP_PEA_2_node_16

CTGCAACTGACATCTTCCGAGCTCGATTTCCAGCCACAGCAGGAGCTGATGCTTCAAATCACCAATGCCTCCTTGGG 
GCTGCGCTTCCG 

<210> SEQ ID NO 534 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 534 

>HUMPHOSLIP_PEA_2_node_17 

GAGACAGCTGCTCTACTGGTTCTT 

<210> SEQ ID NO 535

<211> Length : 52 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 535 
>HUMPHOSLIP_PEA_2_node_23

GAAGGTGTATGATTTTCTCTCCACGTTCATCACCTCAGGGATGCGCTTCCTC 



<210> SEQ ID NO 536 
<211> Length : 12

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 536 
>HUMPHOSLIP_PEA_2_node_24
CTCAACCAGCAG 

<210> SEQ ID NO 537 

<211> Length : 85 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 537 
>HUMPHOSLIP_PEA_2_node_25

GTGTGGGCAGCGACAGGTCGCAGGGTGGCAAGGGTGGGCATGCTCTCACTTTGAGAAGGCCCTGACTCTGGCTCCCA 
CCTCGCAG 

<210> SEQ ID NO 538 

<211> Length : 64 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 538 
>HUMPHOSLIP_PEA_2_node_26

ATCTGCCCTGTCCTCTACCACGCAGGGACGGTCCTGCTCAACTCCCTCCTGGACACCGTGCCTG 



<210> SEQ ID NO 539 
<211> Length : 7
<212> Type : DNA 
<213> Organism : Homo sapiens 

<400> sequence : 539 
>HUMPHOSLIP_PEA_2_node_29
TGCGCAG 

<210> SEQ ID NO 540 

<211> Length : 85 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 540 
>HUMPHOSLIP_PEA_2_node_30 

TTCTGTGGACGAGCTTGTTGGCATTGACTATTCCCTCATGAAGGATCCTGTGGCTTCCACCAGCAACCTGGACATGG 
ACTTCCGG 

<210> SEQ ID NO 541 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 541 

>HUMPHOSLIP_PEA_2_node_33 

GGGGCCTTCTTCCCCCTGACTGAGAGGAACTGGAGC 
<210> SEQ ID NO 542

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 542 
>HUMPHOSLIP_PEA_2_node_36

GTGCCCCACGACCTGGACATGCTGCTGAGGGCCACCTACTTTGGG 



<210> SEQ ID NO 543 

<211> Length : 15 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 543 

>HUMPHOSLIP_PEA_2_node_37 

AGCATTGTCCTGCTG 



<210> SEQ ID NO 544 

<211> Length : 30 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 544 

>HUMPHOSLIP_PEA_2_node_39

AGCCCAGCAGTGATTGACTCCCCATTGAAG 
<210> SEQ ID NO 545

<211> Length : 87 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 545 
>HUMPHOSLIP_PEA_2_node_40

CTGGAGCTGCGGGTCCTGGCCCCACCGCGCTGCACCATCAAGCCCTCTGGCACCACCATCTCTGTCACTGCTAGCGT 
CACCATTGCC 



<210> SEQ ID NO 546 

<211> Length : 30 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 546 

>HUMPHOSLIP_PEA_2_node_41

CTGGTCCCACCAGACCAGCCTGAGGTCCAG 



<210> SEQ ID NO 547 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 547
>HUMPHOSLIP_PEA_2_node_42
CTGTCCAGCATGACTATG 

<210> SEQ ID NO 548 

<211> Length : 27 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 548 
>HUMPHOSLIP_PEA_2_node_44
GACGCCCGTCTCAGCGCCAAGATGGCT 

<210> SEQ ID NO 549 

<211> Length : 41 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 549 
>HUMPHOSLIP_PEA_2_node_45

CTCCGGGGGAAGGCCCTGCGCACGCAGCTGGACCTGCGCAG 

<210> SEQ ID NO 550 

<211> Length : 43 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 550 
>HUMPHOSLIP_PEA_2_node_47

GTTCCGAATCTATTCCAACCATTCTGCACTGGAGTCGCTGGCT 

<210> SEQ ID NO 551 

<211> Length : 15 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 551 
>HUMPHOSLIP_PEA_2_node_51
CTGATCCCATTACAG 

<210> SEQ ID NO 552 

<211> Length : 49 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 552 
>HUMPHOSLIP_PEA_2_node_52 

GCCCCTCTGAAGACCATGCTGCAGATTGGGGTGATGCCCATGCTCAATG 

<210> SEQ ID NO 553 

<211> Length : 83 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 553 
>HUMPHOSLIP_PEA_2_node_53

GTAAGGCTGGGGTGTGAGGATGGAGGAAGAAAGGAGGGGTGAACTGGGCGGGCCCAGACTGAGCGGGGTGCTCCCAC 
CCACAG 

<210> SEQ ID NO 554 

<211> Length : 41 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 554 
>HUMPHOSLIP_PEA_2_node_54 

AGCGGACCTGGCGTGGGGTGCAGATCCCACTACCTGAGGGC 

<210> SEQ ID NO 555 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 555 

>HUMPHOSLIP_PEA_2_node_55

ATCAACTTTGTGCATGAGGTGGTGACGAACCATGCG 

<210> SEQ ID NO 556 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 556
>HUMPHOSLIP_PEA_2_node_58
GGATTCCTCACCATCGGGGCTGAT 

<210> SEQ ID NO 557 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 557 
>HUMPHOSLIP_PEA_2_node_59
CTCCACTTTGCCAAAGGGCTGCGAGAGGTGATTGAG 

<210> SEQ ID NO 558 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 558 
>HUMPHOSLIP_PEA_2_node_60
AAGAACCGGCCTGCTGATGTCAG 

<210> SEQ ID NO 559 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 




<400> sequence : 559 

>HUMPHOSLIP_PEA_2_node_61 

GGCGTCCAC 

<210> SEQ ID NO 560 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 560 

>HUMPHOSLIP_PEA_2_node_62 

TGCCCCCACACCGTCCACAGCAG 

<210> SEQ ID NO 561 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 561 

>HUMPHOSLIP_PEA_2_node_63 

CTGTCTGAGCCCTCAATCCCCAAG 

<210> SEQ ID NO 562 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 562

>HUMPHOSLIP_PEA_2_node_64

CTGGCAG 

<210> SEQ ID NO 563 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 563 
>HUMPHOSLIP_PEA_2_node_65
CTGTCATTCAGGACCCCAAC 

<210> SEQ ID NO 564 

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 564 
>HUMPHOSLIP_PEA_2_node_66

CCCTCTCAGCCCCTCTTTTCCCACATTCATAGCCTGTAGTGCCCCCTCTAACCCCCAGTGCCACAGAGAAGACGGGA 
TTTGAAGCTGTAC 

<210> SEQ ID NO 565 

<211> Length : 22 

<212> Type : DNA 

<213> Organism : Homo sapiens 




<400> sequence : 565 

>HUMPHOSLIP_PEA_2_node_67 

CCAATTTAATTCCATAATCAAT 

<210> SEQ ID NO 566 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 566 
>HUMPHOSLIP_PEA_2_node_69
CTGAGGAGCAAT 

<210> SEQ ID NO 567 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 567 

>HUMPHOSLIP_PEA_2_node_71

GCCCAGTCCCTAC 

<210> SEQ ID NO 568 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 568
>HUMPHOSLIP_PEA_2_node_72

CTTGATCGTGAGAAAGGCGATGTGGGAGAACTCCTTCACGAAGCCGGCAATCTGCTCCCCGCTGTCCCCGTACTTCA 
CTAACCAGGGCCGGCGCTGCACCTCCAT 

<210> SEQ ID NO 569 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 569 
>HUMPHOSLIP_PEA_2_node_73

CTGCCCCACCAGGAAAGACATCAGCCTACAGCAGCTGCATCCTTGCTCACAGCTACCAGCAAGACCTTAGGGCTGGG 
AATTCCTCCACACTTGCCCTCTGTGGGCCAG 

<210> SEQ ID NO 570 

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 570 

>HUMPHOSLIP_PEA_2_node_74

AGCCAGGCAGCCAGCTGGCCACTCCCAGGCATACCCGCTCCCAATCCTCCACAGCAGCCCCTATCCCAGGGCCAGGA 
ATCTCTACCTTAC 



Segment nucleic acid sequences: 
<210> SEQ ID NO 571

<211> Length : 774 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 571 
>AI076020_node_0

CGCTCAGTCCGGCAGCGCAGCAGGAGGGAGGCGCGAGGCAGGAGCCGGCGGCTGGGCTCCGCAGCGCAGCCAGCGCA 
GCGCGGCGCCCCGGGCCCGGCCCCATGCCCGCAGCCCCGCCGGACCGTCCTTGAGCGCGGGCGCCTAGCCCGCGCCC 
CCTGCCCGCCGGCACCATTGCCCCGACGGCGCGGCCGGGCGGCCCGGCGCTCCCCAGGCTCCGCGCGGGCCGAAAGA 
CGCTGCTAGCGGCCGCCGCGGGTGTGGTGATGCTGCTGGTGCTGGTGGTGCTCATCCCCGTGCTGGTGAGCTCGGGC 
GGCCCGGAAGGCCACTATGAGATGCTGGGCACCTGCCGCATGGTGTGCGACCCCTACCCCGCGCGGGGCCCCGGCGC 
CGGCGCGCGGACCGACGGCGGCGACGCCCTGAGCGAGCAGAGCGGCGCGCCCCCGCCTTCCACGCTGGTGCAGGGCC 
CCCAGGGGAAGCCGGGCCGCACCGGCAAGCCCGGCCCTCCGGGGCCTCCCGGGGACCCAGGTCCTCCCGGCCCTGTG 
GGGCCGCCGGGGGAGAAGGGTGAGCCAGGCAAGCCGGGCCCTCCGGGGCTGCCGGGCGCGGGGGGCAGCGGCGCCAT 
CAGCACTGCCACCTACACCACGGTGCCGCGCGTGGCCTTCTACGCCGGCCTCAAGAACCCCCACGAGGGTTACGAGG 
TACTCAAGTTTGACGACGTGGTCACCAACCTAGGCAACAACTACGACGCGGCCAGCGGCAAGTTTACGTGCAACATT 
CCCG 



<210> SEQ ID NO 572 

<211> Length : 170 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 572 
>AI076020_node_3

GTGCGGGCCAGTGCTATTGCCCAGGACGCGGACCAGAACTACGACTACGCCAGCAACAGCGTGATCCTGCACCTGGA 
CGCCGGCGACGAGGTCTTCATCAAGCTGGATGGAGGCAAAGCACACGGCGGCAACAGCAACAAATACAGCACGTTCT 
CTGGCTTCATCATCTA 
<210> SEQ ID NO 573

<211> Length : 175 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 573 
>AI076020_node_8

CGGCGGCGGGTTTGCTTCCTGCGCTCTGAGATGAGCTGCCCTCGGCTCCCTCCGGGGTGGCGCGCCCGGGGGAGGGG 
GGAGTTGGGGGCTGGATAGCTTCCCAGCACCCTCAGAGCCCCCGCCCGGCTGTGCCCCGTCTGACCAAAGTTATAAT 

AAAAACATTTTCACCCCGCAG 

<210> SEQ ID NO 574 

<211> Length : 83 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 574 
>AI076020_node_1

GCACCTACTTTTTCACCTACCATGTCCTCATGCGCGGCGGCGACGGCACCAGTATGTGGGCAGACCTCTGCAAGAAT 
GGCCAG 

<210> SEQ ID NO 575 

<211> Length : 102 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 575
>AI076020_node_4

CTCCGACTGAGCTCCCCACGTCTCCCTCCACCCACGTCCCTCACCCGCCGGGGTCCCCTCCGGGCGGGGCAGACGAT 
GACTCGCCCCTCGCCCACCCGCTCG 

<210> SEQ ID NO 576 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 576 
>AI076020_node_5

CTGCCCGGCCCTCCCCGGCTATGACGCCCCCGGCCCGTGCTCAACACCGCCTGGGCCACAGCTAGGCCCTCCCACCG 
GCTCGCTGCAGAGCCGGGCCCAGCGCGCCCTGTCCCCG 

<210> SEQ ID NO 577 

<211> Length : 76 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 577 
>AI076020_node_6

TGCCAGGGAACCGGGGTTGACCGCCCCCGCCCAGCCCGCGCTATATATTTGTACAATAGGACTGTTTACTGCCCAC 

<210> SEQ ID NO 578 

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 578
>AI076020_node_7

CTCCGCCTGCCAGCCCACCCCAGCCTGGGGAGAGGTCG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 579 

<211> Length : 1,098

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 579 
>T23580_node_17

TTTTTGTTTGTGTACTTAATAAAGGGTAAATATGTCATGTTTGTTTGGAACAGTCATGGTTAATATCTATGTTGTCC 
CAGTATATCTATTAATAGAACTCTCTTTCACTCTCAACAGCGTCCTAGTCCGGATGACAAATTATATGGTTATCTCT 
CAGTAAAGGGTCTTTTTTTAAAATGATTTTTTTTTCAGGGGGTAGGGTAGGCAGGAAGCTTAAACTGGGTAATTTAG 
TTGTAGAAGAGTGCCCTGTGGCAAATAATTGATTATTCATTTCCAGCATCCCCTTTTCTTCTCCTTGACAGTTATTA 
AAAAAAAAAAAGTTACCAGCTTATGTCATTTTAAAGAACACTCGCCCTGAAAACTTCTGAGAGGTTGGCCATTTGAA 
ACCCTGGTTTTAGTGTCTGTATTATTAGTGAACTACCGTGTTCCCATGTGGCTACACAACCACAATTATGTACTATC 
TGGCTCTTTACCAAAGTTTGCAGACCTCTAATCTAGAGTGCGACATTTCCCCTCATTAACTCTTAGGTCCCTTGGCT 
CTAAAAGGGTATATTCATCTTGGCCCTATACAGGGAAAGGGGGAATGGGATTAATGATGTGCTTTGTAAGAAGAACC 
AATTTTAATTTTCACAAAGGCTTGACGTAGCTGTGAGAGAAAGGGTAAGAAGAAGCAGGCTTCTTCTTAGAAGTCTG 
AGATGGCCTAAAGTGGTGGGGGAAGAAGGGAGAGTGGGGAGAAAGAGAAACAAGAAAAGCTGAGAGTGAATTCCCCA 
GAGAGGTAGCCACTGATTCTGCCCTACTCTTTGCTGGAATTCTGGAAAACACCTGGGCTTCTAAAAGATAGGGAGCT 
CATGCATCATGGTAGGGCCACAGCTCAGGCTAGGGCCAGAGATAGCTCAGAGTAGCGCCACGGCTCAGGGTAGGTCC 
ACAGTGCAGGGTAGGGCCATAGCTCAGAGTAGGGCCATAGCTCATAACCACAGCTCAGGGTAGGACCTGCTGATCTA 
TTTGGGGACCCCCAGCAGAGCCTGTCTAATTGCATATCTTGAAAAGGATTGGAAAACTGTCATAATGACATTATTCC 

CTCTCACTTTCCTTGTCCAG 



<210> SEQ ID NO 580 
<211> Length : 259

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 580 
>T23580_node_18

GAAAGCCCAGCAGAATCGGAGAGGCTTTTCCGAGGAGCAGCTTCGCCAGGGACAGAACGTAATAGGCCTGCAGATGG 
GCAGCAACAAGGGAGCCTCCCAGGCGGGCATGACAGGGTACGGGATGCCCAGGCAGATCATGTAGGACGCGGCATCC 
TGCCCCTGGTAGAGAGGACGAATGTTCCACACCATGGTCTCTACGAAAAAGAAATAGTTAGTCACCTTCTGACCTTC 
TCCTCTTTCTCAAAGCCTTCTGTCCCTG 

<210> SEQ ID NO 581 

<211> Length : 201 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 581 
>T23580_node_21

CCGAGAATCCGCGTTGCCTACTGCTGCCACCTCCTGTTCATTTAGAACTATGCAAAGACTCCGCTTCCGTTTTCCTG 
AGCTCCTCGGGCCCCAGAGTCTCTGTTTGATTATTTATTTATTTATTTATTTATTTGCCAAAAATTCTCCTCTTCAA 
CTTATAGAATGCACCTAATAAAGTAATTAGTCTTGTGTCTTACAGTG 

<210> SEQ ID NO 582 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 582 
>T23580_node_19
GTTTTTGCAAGTG

<210> SEQ ID NO 583 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 583 

>T23580_node_20 

CTGCATTTCCG 

<210> SEQ ID NO 584 

<211> Length : 128 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 584 
>M79217_PEA_1_node_2

ATTTTCAAACACCGCATTCCCCTTAAACCTCCCAGGCTCCCATCACTCCCAGCAAAGAGCCAGCCCGCTACTTTATG 
GAGACAGGAGAGACTGTCAGACAGTTCCTGCTGCTGCCCAGGTGCATCTGA 

<210> SEQ ID NO 585 

<211> Length : 177 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 585
>M79217_PEA_1_node_4

GGCGCCCTTCGGCAAGTTCCGCAGTCGCCTGTCGGAAATGGCTGCCGGCCGGCAGGGGGAGCGGCGGATCAGGCGCG 
GCCTGGAAGGCGGGCGGCCGGCAGCCAGAACGGCTTCTGGGACGCCGACTTTCGCGCAGGCGGCGGCGGCGGCGGCG 
GCGGGTCCCTGAGCTGGAAGCCG 



<210> SEQ ID NO 586 

<211> Length : 597 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 586 
>M79217_PEA_1_node_9

TGATGTCTGCTAATGGAAGATAAATGAGAAGCAAACTGGGAAATACATTTTGTCTCAGGATTACTGCATCTACTACT 
GGATAAAGATCAAAAGAGTTCTCTTCAACCCTTTCAACTCTACATTTAACAAATTGAGCTTTTCAGAGTCTTTTTTT 
GTAAAGTATTTCCAAAGAAGACTTATAGGTTAGGAATAAACATAAACTACCCAGGTTGGCTAGGAAGGTATTTCTGT 
TCATCTAAAGATGATGCCCAGGTGTGGAACAGGATAAGAAAAGACCATGGACATCTTTGTCCCATGAATTTAGTTGG
TCATCGTGTTACAGGGCTATAATGCCGCTCTAGATCCAGTTAAATAAGAAGTGGGGAGGGGTTGTAAACTGCAGCTT 
TTTGGGGCACTTATCCATTTATTACCCCAAGTAAAAGACCTATACCAAACAGCAAACAACATCTCTGCATTGTCATT 
ATAATGTTCTTTGAGACACAGCCAGTGTTCCAGCCATTGTTCCATCTAAGATTTAAGCATTTTCTAGAAATGTATGG 
TGGCAGGGGTGTTGAACATAACTTCTTCAAGACTGACATGGTTCTCTTTCTTTTGCAG 



<210> SEQ ID NO 587 

<211> Length : 483 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 587 
>M79217_PEA_1_node_10

GCCTGATTGTTGGCAAAGGCATCATAAGAAGCTGGCATTTATTTCTGTTCTAACCTATTACTGTATAACTGTGAATA
GACACTATGCATATTTGTTGGTCAGCAAAACCAAGAAACAAGAGCTATGGCATTTGAAAAAGTCTGTCTGATTCCAG 
GGTGTTTTTCCTGGGTTTCATCATCAGGTACCTCCTCCCTTTCATCTCAGCAAGAATGTGGCACCTTTTATCGTTTG 
ATAAAGATTAAGGACATGTTCTTTGGTCAACAGCCAGAACTTAAAATCTGCTGGAATAGGGTCAGAGACCATTTCAG 
CTGCAGCTGAGGAAAATGAAATGTTCATTTTATTTGGTGCCTTGTCTGGGGAGCACACTAACTCTTCTGGAAACGTG
TCAGTGAAACAGAGATCGTTTTGTGGAATAGCTAACCCATGGTTATGGCGAGTGACCCGACGTGATCTGGGGGGCAG 
GCTGCAGAGGACTCATGACAG 

<210> SEQ ID NO 588 

<211> Length : 443 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 588 
>M79217_PEA_1_node_11

GCTATACCATGCTGCGGAATGGGGGCGCGGGGAACGGAGGTCAGACCTGCATGCTGCGCTGGTCCAACCGCATCCGC 
CTCACGTGGCTCAGCTTCACGCTCTTTGTCATCCTGGTCTTCTTCCCGCTCATCGCCCACTATTACCTCACCACTCT 
GGATGAGGCTGATGAGGCAGGCAAGCGGATTTTTGGTCCCCGGGTGGGGAACGAGCTGTGCGAGGTGAAGCACGTGC 
TGGATCTGTGCCGCATCCGGGAGTCGGTGAGTGAAGAGCTCCTGCAGCTGGAGGCCAAGCGCCAAGAGCTGAACAGC 
GAGATCGCCAAGCXGAATCTGAAGATCGAAGCCTGTAAGAAGAGCATTGAGAACGCCAAGCAGGACCTGCTCCAGCT 
CAAGAATGTCATCAGCCAGACCGAGCATTCCTACAAGGAGCTCATGGCCCAGAACCAG 

<210> SEQ ID NO 589 

<211> Length : 528 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 589 
>M79217_PEA_1_node_13

CTGCTCCCAGAGAAGGACGATGCCGGCCTCCCTCCCCCGAAGGCCACTCGGGGCTGCCGGCTACACAACTGCTTTGA 
TTATTCTCGTTGCCCTCTCACCTCTGGCTTCCCGGTCTACGTCTATGACAGTGACCAGTTTGTCTTTGGCAGCTACC 
TGGATCCCTTGGTCAAGCAGGCTTTTCAGGCGACAGCACGAGCTAACGTTTATGTTACAGAAAATGCAGACATCGCC 
TGCCTTTACGTGATACTAGTGGGAGAGATGCAGGAGCCGGTGGTGCTGCGGCCTGCTGAGCTGGAGAAGCAGTTGTA
TTCCCTGCCACACTGGCGGACGGATGGACACAACCATGTCATCATCAATCTGTCACGTAAGTCAGATACACAGAACC 
TTCTCTATAACGTCAGTACTGGCCGTGCCATGGTGGCCCAGTCCACCTTCTACACTGTCCAGTACAGACCTGGCTTT 
GACTTGGTCGTATCACCGCTGGTCCATGCCATGTCTGAGCCCAACTTCATGGAAATCCCACCACAG 

<210> SEQ ID NO 590 

<211> Length : 1,146 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 590 
>M79217_PEA_1_node_14

GTGCCGGTGAAGCGGAAATATCTCTTCACCTTCCAGGGCGAGAAGATTGAGTCTCTGAGGTCTAGCCTTCAGGAGGC 
CCGCTCCTTCGAAGAGGAAATGGAGGGCGACCCTCCCGCCGACTACGATGACCGGATCATTGCCACCCTGAAGGCGG 
TGCAGGACAGCAAGCTGGATCAGGTCCTGGTGGAATTCACCTGCAAAAACCAGCCCAAACCCAGCCTGCCAACTGAG 
TGGGCACTGTGTGGAGAGCGGGAGGACCGCTTGGAATTGCTGAAGCTCTCCACCTTCGCCCTCATCATTACCCCCGG 
GGACCCTCGCTTGGTTATTTCCTCTGGGTGTGCAACACGGCTCTTCGAAGCCCTGGAAGTCGGTGCCGTCCCGGTGG 
TGCTGGGGGAGCAGGTCCAGCTTCCCTACCAGGACATGCTGCAGTGGAACGAGGCGGCCCTGGTGGTGCCAAAGCCT 
CGTGTTACCGAGGTTCATTTCCTGCTCAGAAGCCTCTCCGATAGTGACCTCCTGGCTATGAGGCGGCAAGGCCGCTT 
TCTCTGGGAGACTTACTTCTCCACTGCTGACAGTATTTTTAATACCGTGCTGGCTATGATTAGGACTCGCATCCAGA 
TCCCAGCCGCTCCCATCCGGGAAGAGGCGGCAGCTGAGATCCCCCACCGTTCAGGCAAGGCGGCTGGAACTGACCCC 
AACATGGCTGACAACGGGGACCTGGACCTGGGGCCAGTGGAGACGGAGCCGCCCTACGCCTCACCCAGATACCTCCG 
CAATTTCACTCTGACTGTCACTGACTTTTACCGCAGCTGGAACTGTGCTCCAGGGCCTTTCCATCTTTTCCCCCACA 
CTCCCTTTGACCCTGTGTTGCCCTCAGAGGCCAAATTCTTGGGCTCAGGGACTGGCTTTCGGCCTATTGGTGGTGGA 
GCTGGGGGTTCTGGCAAGGAATTTCAGGCAGCGCTTGGAGGCAATGTTCCCCGAGAGCAGTTCACGGTGGTGATGTT 
GACTTATGAGCGGGAGGAAGTGCTTATGAACTCTTTAGAGAGGCTGAATGGCCTCCCTTACCTGAACAAGGTCGTGG 
TGGTGTGGAATTCTCCCAAGCTGCCATCAGAGGACCTTCTGTGGCCTGACATTGGCGTCCCCATCATG 



<210> SEQ ID NO 591 
<211> Length : 128 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 591 
>M79217_PEA_1_node_16

GTGGTCCGTACTGAGAAGAACAGTTTGAACAACCGATTCTTACCCTGGAATGAAATTGAGACAGAGGCCATCCTGTC 
CATTGATGACGATGCTCACCTCCGCCATGACGAAATCATGTTTGGGTTCCG 

<210> SEQ ID NO 592 

<211> Length : 145 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 592 
>M79217_PEA_1_node_23

GGTGTGGAGAGAAGCTCGGGACCGCATCGTGGGCTTCCCTGGCCGTTACCACGCATGGGACATCCCCCATCAGTCCT 
GGCTCTACAACTCCAACTACTCCTGTGAGCTGTCCATGGTGCTGACAGGTGCTGCCTTCTTTCACAAG 

<210> SEQ ID NO 593 

<211> Length : 412 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 593 
>M79217_PEA_1_node_24

GTAAGAAAAAGCTGGTAATAATGGCATCGACTTGGTGAGAGTTTCACCTTTGTGTGGTAGCGGAATGCTGCCCTCAG 
CTTAGCTCTCCTAACGCTTCTTACATGTTTCTTTTGTGCTAGAAGTCAGTTTTTTCTATTTTTACAGACAATGATCA 
AGATGCTTAGAGCAACTCTGGGATAAAAAGTCAAGATGAGAGGGCTGCCTGTACAGTTGCACATAGGCCATTTGGAA 
ACCACTTTATCTTTCTGGGCGTTGGCTCTCCGTTTGTAAAACTGAGGGCACTGGGCTAAAGACACCTCAAATACCTT 
CCAGTTTTAACACTGCCACCCTAGATATGGCCCAGCCATCAGAAGGTGACCTGGGCACTTTTCTGACTTAGATATA^
CATGCCTGTCCCGGGCCCCACGATGAG 

<210> SEQ ID NO 594 

<211> Length : 245 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 594 
>M79217_PEA_1_node_31

CTTCTTCGTGAAGGTGTACGGCTACATGCCCCTCCTGTACACGCAGTTCAGGGTGGATTCTGTGCTCTTCAAGACAC 
GCCTGCCCCATGACAAGACCAAGTGCTTCAAGTTCATCTAGGGGCAGCGCACGGTCTGGGGAAGAGGATGAGCAGAG 
GGAGGAAGATGGCTCCCAAGGTTCCTAGGCATTGCAGGACCTTGGGCACATCTGCTGGTGGGTGGCCCAGAGCCTCT
GCTGGAAGGGGCAG



<210> SEQ ID NO 595 

<211> Length : 617 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 595 
>M79217_PEA_1_node_33

CTGGAGCCCTGGGCGGAGTCCCCGGGGTTCCCCACACAGGGCACTGACTGATAGCTTACACTGAGGACTGTGGCGAC 
TCTGCAGAGTCACTCACACCGTTCGTACGCCCAGGACAGCTGGTTCGTGGTTTTTACATTCAATAACAACTATTATG 
ATTATTTAAAAAGAGAAAGTTTCAGATTTGCCATTCAAGGCTTATTTATATATATGTGTGTGTATATAAATACATGC 
ACACACTTGCATACATATATATTTTTGGCTGGGGGAGTGTGAGTTTTGCCTTTCTAAGGGAGGGACCGCGCAGGCTC 
CTTTGTTCTGTATTCTGGCGGAGATGGGTCCTGGCCTTGTGTCACTGGCTTATCCTTAAAGATCATCTCCCATCCTC 
CCCAGCGCCATCTGTGTGCAGCAACCAGAAAGGGATGAACTTGGCCCTCTTGCGGGCCTGGACAAGGTCTCTTCCTT 
ACCCTTTCTGTTGCCAGTCAGCAACCTGTAACTCACATTCTCTTCCCAGTGAATCCCTGGGAGCGCCTGACCCTGGT 
GGGCTGTTCAGCTTCCTGCTGCTGGGGCCAGCGATTTTTGAGGATTTATCTTTAGGCCAGGCTTGCCTCCGTACTTA
T

<210> SEQ ID NO 596

<211> Length : 238 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 596 
>M79217_PEA_1_node_34

CCCTGCTCTCCCATTTCTCTCTTGTTTGAGAGAGAATGAGGAAGCAAAGAGTGAGAAAGAATAGGGGCTGAAGACGC 
CACTCCCAGATGGCTCTTTCTATCCTGCTCTTCTGTTGAAACACACGTGCTGTGGGCCTCAGGCGTTTCTGAAGTGC 
TCTTTCTTGGATTGGACAGGAGATCAGCAGCGTGCACATCTGCTGTGGTCTGAAGTGGTTTGCAGGTCAGCCTCCTC
TCCCTAG

<210> SEQ ID NO 597 

<211> Length : 128 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 597 
>M79217_PEA_1_node_35

TGTAGAGCAAGCCAGTGTCCTTCGAGGAACCCACCCGGCTGGCCGGGAAGTTTTACAGCAAGGCGCCTGCCTTGGGA 
TAATTCCTTGGTGAAATTCACCTTCCCCCCGCCTCTGTCTGGAGCCCCATC 

<210> SEQ ID NO 598 

<211> Length : 242 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 598
>M79217_PEA_1_node_37

CTGTAGGACTCCCCGAGGTTTGGTATGTGCTAGAACAATGGGAGGCTGTGATTTGCTGTGTAAGCTCACATCCAGCC 
TTGGAATCTAACGGGCATTCACAACCCGAGTTACCACTTTCCACTCCCTGCTTAGGATTCTGTTCCCTGGGCTGAAA 
CTGAAATAAGCTAATTTTTTGGGTCACGGTGGCAGTAGGGGAACCTAGGAGGGTGTGAGTGGCATTTGTCAGGGATT 
TAGCCCATGAC 

<210> SEQ ID NO 599 

<211> Length : 156 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 599 
>M79217_PEA_1_node_38

GTGTTTCTTGAACCCTACTTTCTGGAAGTGGAGTTGACTCTGGAAGTTTTCTAGCAACTGAACAAAAGCTCAGGTTT 
GTCCTGGTCATGCACATGCCTTAAGCCAGTTCCGTCTTCCCTAGACCTTGGCATCCTGTGCTTCTATTTCTTGGAAT 
AC 

<210> SEQ ID NO 600 

<211> Length : 730 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 600 
>M79217_PEA_1_node_41

CCTGTAGCTCCCCTGCTGAGGCAGTGTGGTTATGTTCCCAGCAGTGGGGGTCAGACGCCCTTCCTCAGAACTTTCTA 
GTTGCCCTCTACCTGACTCCTGACTTGTATTCCTTTTAGCAGTAGCCTTCTTCCCTCGGGGAGCCAAAGAGTGTGGT 
GTGTGGCGCTATATTGTGGCTGCTATTTCATCTGGTTTCTTTTAATGTGAGGAACTCACATACTGACTTCAGTGGGA 
CTCGGTGAGCCGGGGCCGTCTGTGTGGTGGGACCCCCTTTAGCGGGACTCAGTGAGCTGGGGCCGTCTGTGTGGTGG 
AGCCAGGGCCTCTCCCTTTAGTGGAGCCAGGTTGTCGGGCCCCGAATGTCACTGGTGGATCTAAGAAGGGCTGAGTG 
GTCTGACACCAAAACATGCCGCAGGGAGGGCTGTGGTGCCGGTGCTTCCAACAAGGACAGCCCTCCTTGACCCTGAA 
AGGAACACTGGCTTGAAGGACTGCAGACAGGCTCTGAGGGGCACGCCCTCCTCAGCGAGAGGCAGCAAGGTGGCCAC 
AGTGTCACTGGTCAGGTGCTTCTCACCACGGGAAAGCCGCCGACCTGTGACTCGCTTGAGATGGGAAAGCGGCGCCA
CAGACCCCGGGTCTCCTTGGCTGTCTGTGGGCCGCCCCTGGCCACCTTGTCCTGGCTCGCAGGGTGCAGGAGCGCCT 
CGTTCTCTGGGTGGCCGGCTTGCTGCTCCGGTTTGGG 

<210> SEQ ID NO 601 

<211> Length : 188 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 601 
>M79217_PEA_1_node_44

CTGTCCGGAAGGATGGGATCTTTCTGGGAGCTGCGCCGGACAGAGTGGGGAGCTCCTAGTTTGTGGGGGGAAGCTTT 
GATATCCATGCCACGTCCATCCACCCCACCCCTTTTCGTCACGAGCACAATGGTCTTACATTGGATTTTTGTAAAAA
AATAAAAATAAATGGAGACTTTAACTCAAGCAGC

<210> SEQ ID NO 602 

<211> Length : 49 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 602 
>M79217_PEA_1_node_0

GGCTGCAGTGGCCGCCGCTGGAGACCGCGGGACCTGGCGCTGCAGCGAG 

<210> SEQ ID NO 603 
<211> Length : 94 
<212> Type : DNA 
<213> Organism : Homo sapiens



<400> sequence : 603 
>M79217_PEA_1_node_7

GAGAGCAAGCCCTGGAGGTTCACTCTTTCAAGAAGTCGTGTGCTGAGGTGTAATGCTACACAAGTCAGAGGAAGGAA 
GGGTCCTGAAACACATG 



<210> SEQ ID NO 604 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 604 
>M79217_PEA_1_node_12
CCCAAGCTGTCCCTGCCCATCCGA 



<210> SEQ ID NO 605 

<211> Length : 79 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 605 
>M79217_PEA_1_node_19

TCCTGAGCTCAGGCAACCTGCCCGCCTCGGCCTCCCAGAGTGCTGGGATTACAGGCATGAGCCACGGTGCCCAGCCC 
AG 



<210> SEQ ID NO 606 
<211> Length : 77

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 606 
>M79217_PEA_1_node_21

ATGGGTTCTCACTTTATTGTTCAAGCTGGTCTGAAACTCCTGGCCTCGAGCAAGCCTCCCAAGTGCTGGGATTACAG 

<210> SEQ ID NO 607 

<211> Length : 36

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 607 
>M79217_PEA_1_node_26

TATTATGCCTACCTGTATTCTTATGTGATGCCCCAG 

<210> SEQ ID NO 608 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 608 
>M79217_PEA_1_node_27

GCCATCCGGGACATGGTGGATGAATACATCAACTGTGAGGACATTGCCATGAACTTCCTTGTCTCCCACATCACTCG 
GAAGCCCCCCATCAAG 

<210> SEQ ID NO 609

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 609 
>M79217_PEA_1_node_30

GTGACCTCACGGTGGACATTCCGATGCCCAGGATGCCCTCAGGCCCTGTCTCATGATGACTCCCACTTCCACGAGCG 
GCACAAGTGCATCAA 

<210> SEQ ID NO 610 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 610 
>M79217_PEA_1_node_32

CAGGAGGAGTGGAAGGAAACCGCTGCCTTTATCTTGAAGTCAGCCACACTGGGC 

<210> SEQ ID NO 611
<211> Length : 41 
<212> Type : DNA 
<213> Organism : Homo sapiens 

<400> sequence : 611 
>M79217_PEA_1_node_36

CTGTGTTATCTGTGGTTTTTGGACCCCTAATGTCAGCTTGG 

<210> SEQ ID NO 612

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 612 
>M79217_PEA_1_node_39

GTTCTCCTCTGACCTGCCTGTACCACGTGGGTCCTCTTCAAGTACTGTTTTGAAGCTGGGCTCTTTTGTGTAGCTCC 
CACCCAC 

<210> SEQ ID NO 613 

<211> Length : 107 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 613 
>M79217_PEA_1_node_40

CTGTAGGGCTAGCTCGGCTTAAGGGAACTCTCCCCATTGGCAAACCGGACCCGGCCGCCGCCAGGACTGTGTTTCCA 
AAGGTTCCCCGCCCCCAACCCCAGCATCAG 

<210> SEQ ID NO 614 

<211> Length : 86 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 614 
>M79217_PEA_1_node_42

CTGTCTTACCATAACACCGTCCCAGGGCTCTGCAGGCCACTGTGAGCGCTGGCTCCCTGGGCAGTGCTCCTCCGTGT 
GGACTGTGC 

<210> SEQ ID NO 615

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 615 

>M79217_PEA_1_node_43

CTCAGGCCAGGGCTCACCAGCTGGGGTC 

Segment nucleic acid sequences: 

<210> SEQ ID NO 616 

<211> Length : 355 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 616 
>M62096_PEA_1_node_0

CGGCAGAGCGCGCTGGTGCTGATGCAGGATGGCTGAGCGCGCAGGAGCCCGGGAGGTCTGAGCCGGGCGAGGCTCGC 
TCCCTGCGCATCGCCTCCTCCGCCCGCCGCGTGGTCGCGGGCAGGTGGGCCGGGGGGCGCTGGGCAGGGGCGGGGCA 
GGGCCAGGGCAGGCCGGTCTGCAGCCGGAGGGGCCGGAGCGGAGAAGCTGCCCACCTTCCCGGGCTCGGAGCGGCCG 
GGGCTGCTCAGCCGGCCGGGCTCGCGATGACCTGCTGAGAAGCGTCGTCGGAGGCTGCAGGAGGCGGCCTAGCTGTG 
GGCGGTGCAGCTCGCGGCCTCCTCCCTCGTCGTTCCCGGCCCCGGCC 

<210> SEQ ID NO 617 

<211> Length : 148 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 617
>M62096_PEA_1_node_2

CCCCTCCCTACCGCCGGCCGAGATGGCGGATCCAGCCGAATGCAGCATCAAAGTGATGTGCCGGTTCCGGCCCCTCA 
ACGAAGCGGAGATCCTCCGCGGGGACAAATTCATCCCCAAATTTAAAGGCGATGAGACCGTGGTGATCGGG 

<210> SEQ ID NO 618 

<211> Length : 125 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 618 
>M62096_PEA_1_node_15

ACATGAATGAACACAGCTCTAGAAGTCACAGTATCTTCCTGATAAATATTAAACAAGAGAATGTAGAGACTGAAAAA 
AAACTCAGTGGGAAACTTTATTTGGTTGATTTGGCTGGGAGCGAAAAG 

<210> SEQ ID NO 619 

<211> Length : 147 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 619 
>M62096_PEA_1_node_17

GGTTTGACTTAGCCTGTGGCAGAGGCAGCTGCACACATGTGGGAGGGATATGAACTTCTGAGAAAAGAGAAAATCCC 
ATGTTGTACCCGACTCTAATAGGCCAGGAACGAGCTTTGCCATTGGAATGTGGCAGTCCTGCCTCTGCAG 

<210> SEQ ID NO 620 
<211> Length : 125 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 620 
>M62096_PEA_1_node_19

GCTGATTGTCCCCATGAAGGCCAGCCTTGAAGCTTGGTCAGTCTCCCTAACTGTATGATTGATCCCCACTTATTGCA 
CTACATCACTGAGTTCCCGTATGCCAAGTTATGGCCACTTACATCCAC 

<210> SEQ ID NO 621 

<211> Length : 149 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 621 
>M62096_PEA_1_node_23

AAAACACATGTGCCATACCGGGACAGCAAGATGACTCGGATTCTTCAGGACTCTTTGGGTGGGAACTGCAGAACCAC 
CATCGTCATTTGCTGTTCTCCTTCTGTCTTCAATGAGGCTGAGACCAAGTCCACACTGATGTTCGGACAGAG 

<210> SEQ ID NO 622 

<211> Length : 149 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 623 
>M62096_PEA_1_node_27

AGCTAAGACCATCAAGAATACAGTCTCTGTGAACCTAGAACTGACAGCAGAAGAATGGAAGAAGAAATATGAAAAAG 
AGAAAGAGAAAAACAAGACTTTGAAGAATGTTATCCAGCATCTGGAGATGGAGCTAAACAGGTGGAGGAATG 

<210> SEQ ID NO 624

<211> Length : 167 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 624 
>M62096_PEA_1_node_29

ATTTTCTAGCAGCACACGTGTTTGGAAAGCTACTAGAATAATTGAATAATTCAGCACCTGAGGCTGGTGGATGATTC 
TTTGCAATTTGGCAGGAATGGGAGAGTCGGGAGCAGTAGTTGGCAAGGTGGGGAGTAGCCATATGAAGTTTTATTTC
GGGAATCCTCCAG

<210> SEQ ID NO 625 

<211> Length : 176 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 625 
>M62096_PEA_1_node_31

GAGAAGCTGTGCCTGAGGATGAACAGATCAGTGCCAAGGACCAGAAGAACCTGGAGCCTTGTGATAACACCCCCATC 
ATAGACAATATTGCTCCTGTTGTTGCTGGCATCTCTACAGAGGAGAAAGAGAAGTACGATGAGGAGATCTCCAGTCT 
CTACAGACAACTGGATGACAAG 

<210> SEQ ID NO 626 

<211> Length : 504 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 626 
>M62096_PEA_1_node_34

GTAAAGAATGCAATATATTTTTTTTTCCACAAAGTTCTTCTATTACTCTTTGTTGTTGATGTTTGTTCCAGGAATTT
AATTGGCATAGAAGCTTTTCATAATTACAGAATCATGTGGAAATTTCTTGGTAGATGTCCCTTCACTGCCTCTTACA 
AGCTGATTATCACTGAATTTAGAAAATAAATGTCTGACTTTCAAAAACCCCTGATGTTTTGAGATTGAGTAGCCAGT 
GGCTACAGTTCGTTCTGGAAGGGCAGAGACCTTTGGTTGGGTGATCAAGCAAGGATGATCCTTTTTTATTTTTATTT 
TTTTGAGACAGGGTCTCTCTGTTGTCCAGGCTGGAATGCAGTGGTGCAATCATGGCTCACTGCAACCTCCAGAGCTC 
AAATGATCTTCCCGCCTAAGACTCTCAAGTAGCTAAGACTACAAGAATGTGCCACCATACCTAGCTAATTTTTTAAT 
ATTTTGAGACAGAGTTTCTCTATGTTGGTCAGGGTGATCTTG 

<210> SEQ ID NO 627 

<211> Length : 207 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 627 
>M62096_PEA_1_node_36

CTTTTAGCTTCCACAAGAAGAGACTATGAGAAGATACAGGAGGAGCTGACACGTCTCCAGATTGAAAATGAGGCAGC 
CAAGGATGAGGTGAAAGAAGTTCTCCAGGCCCTGGAGGAGCTGGCTGTCAATTATGACCAGAAATCACAGGAAGTGG 
AGGATAAGACCCGGGCCAATGAGCAGCTGACAGACGAGCTGGCCCAGAAAACG 

<210> SEQ ID NO 628 

<211> Length : 147 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 628 
>M62096_PEA_1_node_38

ACTACATTGACAACCACACAGAGAGAGCTGAGCCAGCTACAAGAGCTTAGCAACCACCAGAAGAAAAGGGCAACTGA 
GATCCTGAATTTGCTGTTGAAAGATCTGGGGGAGATAGGTGGAATTATTGGCACCAATGATGTGAAAACT 

<210> SEQ ID NO 629

<211> Length : 189 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 629 
>M62096_PEA_1_node_40

TTGGCAGATGTGAATGGAGTCATTGAGGAGGAGTTTACCATGGCCCGCCTGTACATCAGCAAGATGAAGTCAGAGGT 
CAAGTCCCTGGTGAACCGCAGCAAACAGCTCGAGAGCGCCCAGATGGACTCCAACAGGAAGATGAATGCCAGCGAGC 
GGGAGCTGGCAGCCTGCCAGCTGCTCATCTCCCAG 

<210> SEQ ID NO 630 

<211> Length : 340 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 630 
>M62096_PEA_1_node_48

GTGAGTCGGCCCAGGGCACCAGGGGTGTGGGGGTGGTCATGCCTCGGTCCTCTTGGGGAAGCCTGGAAGGATGTGGC 
TCTTAGTCGAGGGCCCTGCTCACCTTGCCCTGTGGGCACTGCCCTGGTGACACAGCAGGCTGGGCGGGCTGCTTCCA 
AGGTCTGTTCTCCGATCTGGAGCTGAGCCTCCTGGAGCCCTGGCATGGCAGGTGGCAGGCGGGCCCAGCTGCCTCTC 
CTAGTCCCCGAGGGCCAGGGTCACATAGGTGATTCCGCTGGATGGACGCATGCCCCAGTAGATGGGGGGGAGCATCT 
GTAATGAAAGCACCAAAAAAAAAAAAAAAAAA 

<210> SEQ ID NO 631 

<211> Length : 688 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 631
>M62096_PEA_1_node_50

AAGCTTCCCTGGCTTGTGCCTTTCAAATGGACTCTGGGTTTTCCTTCTGGTAGTGCATAGCTGTTCCTTTACAGGCG 
CTTAGGCGTGGCTCTAGGAAAGGTTTTATGAGTCCTGGGCTGATGTAAATGTTGACCAAACACCCTCAACCAGATGG 
CGAGTTTCTGTTTGCAGCAGAGCCCAGGCTGTCTTTTCTTCATAATTCTCTCTGTGCCCACTCCTCGAGGGCAGGAA 
CTGTCCCTGTATCAGTGAGGCATTCGGACTTGGGAGATGTTTTTAGAACATCAGACCAGAAATGAGGGAAGGTGGAA 
ATGGCCAAATCAGGTTCCCCAAGTGACTGCATGCCATCCGAGGGGCCGAGGAAGCAGAGTTCTTCTGACATGGGCTC 
TCTGTTTTAAAATATCAGCCCTTCTCCCATCTCATTATTTTTCCCCTGAAGCTCTTGCACAAGCAAACTATAAATAC 
ATCCTCAAAGCCTTATGTTTCATGACTCTTAGATGCACCCCAGAAATTAGTTTTCACTTGGGCACGGAGGAAGCTGT 
GAGGCTGTTCTGTGATCCCTCCAAATCCTGCAGAATTACTGCCTTTATTGTACAGAGCTAATAGGGTTGGAACAGAA 
CCACGGTTTTAGCCTGATGACTCAGAATTTCAGACTGATGTGGAATATATTGCTTTTTCCTCTCAATTTCAG 

<210> SEQ ID NO 632 

<211> Length : 136 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 632 
>M62096_PEA_1_node_56

GTGAGTTCTCTTTGTCTGAATGGGACTGAGAAGAAAATCAAAGATGGCAGGGAAGAATCATTTTCAGTTGAAATATC 
ACTTGCTTAAGTCGGGGCTGGTTATGCTTAAAAATTAATTACTGCACACCAGAGAATGT 

<210> SEQ ID NO 633 

<211> Length : 217 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 633 
>M62096_PEA_1_node_60

CTGGTCCGGGACAAGGCAGACCTGCGCTGTGAACTGCCCAAGCTGGAGAAGCGGCTGCGTGCCACGGCGGAGCGCGT
CAAGGCTCTGGAGAGCGCGCTGAAGGAGGCCAAGGAGAACGCCATGCGGGACCGTAAGCGCTACCAGCAGGAGGTGG 
ATCGTATCAAGGAGGCCGTGCGGGCCAAGAACATGGCCAGAAGGGCCCATTCAGCCCAGATCG 



<210> SEQ ID NO 634 

<211> Length : 1,320

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 634 
>M62096_PEA_1_node_65

ATATGACTCCACGTAGCATGTCAAGGACTACATTAATCACCAATTCCTTTATTTTTCCCCCCCTACAGTTTCCATTT 
TTTTTTTATACTTGCTTACTCCAGCCATCTGCAGTACACCAGTTTCAGGTCTTTTGAGCTGTGTAGAGTTTCTGTGT 
GTACAGATGTGTGCTCGGACTTTTCTCTTTTTGAGAAATCTGAAGGAGATGGTTGCAGAAGATCCACTTACTACTGA 
GAACCATTACCACCGACTCGGCCTCCGGGGTGTTGGGTGGTTTCTGGGTGGTTCCTGGAGCCTCCTCTGGGCAGTGC 
ACTGTCCCATCTGTACGCCCTAATGTGCCATTCCCTAGAGGGGAACAACCAAGTGCCGTGGAGGCAGATGATCATGG 
TCTGCCTCAACTGTCTGGTTTCCTGTAAAATAAACACATTGTTTTATATTTTTAGGGAACAAAAAGTGCTGCTATAG 
GGTTCAAAGTTTTCCTTCTGAACACTTTTCCGAAACAAATTACCCCAAAGACACATTTTGAATATCCTGGTCACATC 
TTTGGATCTGTAAAATATACCTTTTAGTATGGCACCTGTTAAAATGCAAAGCAAATTTCTTTGGGGCAGAAAAACAA 
TCTGACAGTAGCAGTGTAGAATTTGTTCATTCAAATACATCTGTGTAAATGCAAAAAGTCATAAAATTCACCTCCGA 
GCTGCTTGCTTTTGAACCTGCAGCAACTAGTCTTAGCCGGCCCGGTTTGAACATCGTTCTTTCAGAAGTGCTGAAAA 
TGCTGCAAAGTTGGATAAGTGGAAATGTGGCTGCCCCTCTCCTCACTACTTCCTCTCTGATCGTTCTGAAGCTTGCA 
TTGGGAATGGCTGCTTTCTCTAACCATTTTCAGCTTGAGTGGGTATTGCTGAAGAAATCCAACATCATTCCAGCAGT 
TGAAAAAGGAAGCCTTCGGGAGAAAGTGCTTGTCAAAATTTTGTTCTTTGTGCTTGTGTATGAGTAAGTTGCCATGA 
ATAAGTTATTATTTTAACCCATAATTGGCGACTGTTTATATGAATTCTTTCTTTGGCACCAAATAGGTTTCATCTTC 
TTAGGCACAATTAGAAAAAATCCACATAGATGGATATTTTACATTTAGTTATTGCTTTATCCAAATACATGAATCTA 
AAGCTGAATCAACCCTTACTTCCAGTTGTGCTTATTAAGAAGATCAATTTCCAAGTAGTAAAGTTTTCAGGGAAACT 
GACTGTGCTGCTATTTGTTTTGACAAATTTGGGGGTAAGTCAATGACAACCAAACCAATCTCGGTGGAAACTCCTAT
CCTATCATGTT



<210> SEQ ID NO 635 
<211> Length : 933

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 635 
>M62096_PEA_1_node_69

GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTGTGTAAAGTGCTAAGAACTGTGCATTGACATCCAAACATTTCT 
TGTACAAAATTTCCCTAGCAAAGCAAACCTGCTTTGACTTAATTTATTTGTTAAATGTTGCACTTTGTTTATGTATG 
TTTTGTTTTTGGTGGGGAATAAGGAGAGAGAGGACGACAAATTCTATTGAAGTATTTATTTTGTGAAGATGGCAATT 
TTGCATTTGTTTAAATTTTTTTCATTCTTTAATTTTGTTATCAGTGCCAGCCCAATATACCTGCTCTACCATTATTT 
GCGGTCTGATAAAAGGGTCCTTGTGGGGCAGGTTTTGCAAAGCTTATCAGGTAATAACATATGCCACATAACCTTGT 
TGATATGTTTGCTTCTGATTTGGGAAGCTAAACATTGGTGTTTGAGAGGATTGCCAATTATTAATTGTCATTACCAC 
TACTCTCCATTACTTTTTGTTTGGAAATTGAACAAAGGTCAGTAATGGTTTTTGGCTCTTGTTAATATCCATCATAA 
AATAGATTGTTTTAGATTCTTTCCAGGGTGATTTTTCCCTGGGTACCCCGTTTCTACTTCTAAAGAATTGCTTGGCA 
CTTTCATGTTTCAAAGGGAAACATTCGCTTGTAGTTCCATTTTACTTGATCTCTACAAGGGACTGACAACATTTGCT 
TTACTTTTATTCACAGAGAAAGTTGGCTTTGATGTCTCTTAAAGATAATTCTGCTAGTTGCTGATCAGCCAGTCAGT 
TCACCTAGCTTCAATCTTTATAGGACTTCTAATCTAATTTTCCTATAGTGTGACTAAAAGGGAGGCAAATTATTGGA 
ACGGATTATTCAAATGGATCCTTAAATATTGCTATGTATAATAAGCCAGTTATTATATCAGGACCATGTTCTCTGTA
GGCCACTTT

<210> SEQ ID NO 636 

<211> Length : 1,247 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 636 
>M62096_PEA_1_node_71

TTTAAATGTAAAAACCACACTTCTGAACAACTAAGCTCATGAATATGATTTTGGTTATATGCAGCTTTTGACTAGCA 
TGTATTGTGTCTTTTTCTCCTCTATGAATAATTTTATATTTCATGCTACTTCTTGAAAGTTTACTCTTTGATGCTCT 
AAGAGAACAGCCAGATGGTTTATATGAATAATCTTTATCTGCAGGATGGTGGATTGGTAAATTAGGAGAATGTTGTT 
TGAGATATCAAGATTTATGTCTGGGAACTAAAATATATAATGCCAAATGTGTTTTTGTCAATTACTAGAGAATTCTG 
TGCAAACATATCATCTCTTCAAATGCTGCACACTTTGCTTTTGTTAAACAGCAGGTAGTAGACAGAACAATAACAGT 
TTCGCGTTAAGACTTTTAAAGGAAATAGAATCGTGATTAAGAAATCAGAATTTATAGATATATTGGGATAAATGAAG 
AAATAAAAATGTTTGTCTAGAATGTAGCATCTAGTGACTTTTTAAAGCCCTAACGTTTACATAAAGAAGCTCTAGTT
CTTATAGAAATAACAAAGCAAATAAAAGTTCTTAACAATCCCCTCTTTCGAAGTGCATTTTTTTAAAGCAGGGCAGG 
AGACATTTGGACTCTAGCTATATGACATACTGGGAAAGGCAGAGGGTGGAGGGAAGATTTCACTTCATTGTCTAGCC 
CAGAATCTTGAGCAAGCTAAAGAAACCATCATAATCTAAAATTGCTTCATTTAACACTAACAATTTAGACTTTTTAA
ACCAAGCATTGAATAATGGCTGGATAACTGCCGAAGTAAGCGCCGCTCCATGAAGTCTGCTTACTTATTTAAAAATT 
GTGTATCAGTTTTAAATACTGTTCATTGTGTGCAGATATAAGGGGAATAGGGCATTCTGTAGAATTATACATGTCTA 
GTTTGTAAAGTGTGTCCTGTGTACTGCAGATGTGTGTTCTCTGGGCTTTATGTATCTGTACAGTAGCTTTCACATTA 
AAAAAATTGTGGACAAACTTGTCCGGGGGGTTTGAGGGGAGAATGGTGGTTTATATCAATAACGATGCTGTACTATA 
GTCCATGTAACAAAAGATCTGGAAGTCACCCTCCTCTGGCCCACGGAAAATTTTGGTAATCTTCTAGGTTCTAAAAT 
GAAGATGTATGGGTACTCTGGCAGACTGCATGTTGTATAATTTGAAAAATACTAAAAGTGGAAAATAAAATTGAATT 
AAACTTTGGCTGGTC 

<210> SEQ ID NO 637 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 637 
>M62096_PEA_1_node_1
CCCCACCCATCCCCGTGC 

<210> SEQ ID NO 638 

<211> Length : 91 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 638 
>M62096_PEA_1_node_4

CAAGGGAAGCCATATGTCTTCGACAGAGTGCTACCTCCCAACACGACCCAAGAGCAGGTTTACAATGCATGTGCGAA 
GCAAATTGTCAAAG 

<210> SEQ ID NO 639

<211> Length : 74 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 639 
>M62096_PEA_1_node_6

ATGTCCTTGAAGGTTATAACGGGACGATTTTTGCGTATGGGCAGACTTCATCAGGAAAAACCCACACCATGGAG 

<210> SEQ ID NO 640 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 640 
>M62096_PEA_1_node_7

GGGAAGCTGCATGACCCCCAGCTCATGGGGATCATCCCACGAATTGCCCATGATATCTTTGACCATATCTACTCCAT 
GGATGAGAACCTGGAGTTTCACATAAAG 

<210> SEQ ID NO 641 

<211> Length : 49 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 641 
>M62096_PEA_1_node_9

GTTTCCTATTTTGAGATCTACTTGGACAAAATAAGGGACTTACTTGATG 

<210> SEQ ID NO 642

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 642 
>M62096_PEA_1_node_11

TATCCAAGACCAACTTGGCTGTTCATGAAGATAAAAACAGAGTCCCGTATGTAAAG 

<210> SEQ ID NO 643 

<211> Length : 88 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 643 
>M62096_PEA_1_node_13

GGGTGCACTGAGCGGTTTGTGTCGAGCCCTGAGGAAGTCATGGATGTAATAGATGAAGGCAAAGCAAACCGACACGT 
GGCTGTGACAA 

<210> SEQ ID NO 644 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 644 
>M62096_PEA_1_node_21

GTCAGCAAAACTGGTGCCGAGGGAGCTGTTCTTGACGAAGCTAAAAATATCAATAAGTCTTTGTCTGCTCTTGGAAA 
TGTGATCTCTGCTTTGGCAGAAGGGACA 

<210> SEQ ID NO 645

<211> Length : 61 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 645 
>M62096_PEA_1_node_25

GGTATGAAATCAAGGCTTAGGTGCAAAGCCATTGGATACCATACCTGAGACCACACAGCCA 

<210> SEQ ID NO 646 

<211> Length : 69 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 646 
>M62096_PEA_1_node_33

GATGATGAAATTAACCAGCAGAGCCAGCTGGCTGAAAAGCTGAAGCAACAGATGTTGGATCAGGATGAG 

<210> SEQ ID NO 647 

<211> Length : 118 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 647 
>M62096_PEA_1_node_42

CACGAAGCCAAGATCAAGTCTCTGACAGACTACATGCAGAACATGGAACAGAAGAGGAGGCAGCTAGAAGAGTCCCA
GGACTCGCTCAGCGAAGAGCTGGCAAAGCTCCGAGCCCAGG 

<210> SEQ ID NO 648 

<211> Length : 77 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 648 
>M62096_PEA_1_node_44

AAAAAATGCACGAAGTCAGCTTCCAGGATAAGGAGAAGGAACATCTGACGCGGTTGCAGGATGCTGAAGAAATGAAG 

<210> SEQ ID NO 649 

<211> Length : 110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 649 
>M62096_PEA_1_node_47

AAGGCGCTGGAGCAGCAGATGGAGAGCCACCGGGAAGCTCACCAGAAGCAGCTGTCCAGACTCCGAGACGAAATTGA 
GGAGAAGCAGAAAATCATTGATGAGATTCGGGA 

<210> SEQ ID NO 650 

<211> Length : 102 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 650 
>M62096_PEA_1_node_51

TTTGAATCAGAAACTGCAACTGGAACAGGAGAAGCTTAGTTCTGATTATAACAAGCTGAAAATAGAGGACCAAGAGA 
GAGAAATGAAGCTGGAAAAGCTCTT 

<210> SEQ ID NO 651 

<211> Length : 61 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 651 
>M62096_PEA_1_node_53

ATTGCTCAACGATAAAAGGGAACAAGCCAGAGAAGACCTCAAAGGGCTGGAGGAGACAGTG 

<210> SEQ ID NO 652 

<211> Length : 72 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 652 
>M62096_PEA_1_node_55

TCTAGAGAATTGCAGACACTGCACAACCTTCGGAAACTCTTTGTCCAGGATCTGACCACCCGAGTTAAAAAA 

<210> SEQ ID NO 653 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 653
>M62096_PEA_1_node_58

AGTGTGGAGTTGGACAACGATGATGGAGGGGGCAGTGCTGCCCAGAAGCAGAAAATTTCCTTCTTGGAGAATAACCT 
GGAGCAGCTCACCAAAGTTCACAAGCAG 

<210> SEQ ID NO 654 

<211> Length : 114 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 654 
>M62096_PEA_1_node_62

CCAAGCCCATCCGCCCCGGACACTACCCGGCCTCATCTCCAACGGCCGTCCATGCCATTCGAGGGGGAGGAGGCAGC 
TCTTCAAATTCCACTCACTACCAGAAATAAATACAAA 

<210> SEQ ID NO 655 

<211> Length : 118 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 655 
>M62096_PEA_1_node_66

GTGTGCCCAAGATGAGTGAGCTGGCACTGTGCCCTGAAGCTTTCACCACTGTAATGAAATATATGCCAGGGGAGACT 
TTGGGCTTTTCTCATGACTGTGTGGGTCGAAGGTAGCTCAA 



<210> SEQ ID NO 656 
<211> Length : 6 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 656 
>M62096_PEA_1_node_67
GTGTGT 



<210> SEQ ID NO 657 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 657 
>M62096_PEA_1_node_68
GTGTGT 



<210> SEQ ID NO 658 

<211> Length : 55 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 658 
>M62096_PEA_1_node_70

CTAAAAAAGCCACATATGTGCAATTTTCAGGTTTTTAGACTATTGCTCCCTGTAC 



<210> SEQ ID NO 659 
<211> Length : 160 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 659 
>M78076_PEA_1_node_0

CGCGGGGCGGGGCTGGCGGCGCCGGCGCAGCCCGGGGGCGGCGGGAGGAGGAGGTGGCGGCGGTGGCGCTGGGAGCT 
CCTGTCACCGCTGGGGCCGGGCCGGGCGGGAGTGCAGGGGACGTGAGGGCGCAAGGGCCGGGACATGGGGCCCGCCA
GCCCCG

<210> SEQ ID NO 660 

<211> Length : 133 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 660 
>M78076_PEA_1_node_10

ATGTACCCGGAGCTGCAGATTGCACGTGTGGAGCAGGCTACGCAGGCCATCCCCATGGAGCGCTGGTGCGGGGGTTC 
CCGGAGCGGCAGCTGCGCCCACCCCCACCACCAGGTTGTGCCCTTCCGCTGCCTGC 

<210> SEQ ID NO 661 

<211> Length : 134 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 661 
>M78076_PEA_1_node_15

GCCTGCAGCTCCCAGGGCCTCATCCTGCACGGCTCGGGCATGCTCTTACCCTGTGGCTCGGATCGGTTCCGTGGTGT 
GGAGTATGTGTGCTGTCCCCCTCCAGGGACCCCCGACCCATCTGGGACAGCAGTTGG 

<210> SEQ ID NO 662

<211> Length : 179 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 662 
>M78076_PEA_1_node_18

TGACCCCTCCACCCGGTCCTGGCCCCCGGGGAGCAGAGTAGAGGGGGCTGAGGACGAGGAAGAGGAGGAATCCTTCC 
CACAGCCAGTAGATGATTACTTCGTGGAGCCTCCGCAGGCTGAAGAGGAAGAGGAAACGGTCCCACCCCCAAGCTCC
CATACACTTGCAGTGGTCGGCAAAG



<210> SEQ ID NO 663 

<211> Length : 131 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 663 
>M78076_PEA_1_node_20

TCACTCCCACCCCGAGGCCCACAGACGGTGTGGATATTTACTTTGGCATGCCTGGGGAAATCAGTGAGCACGAGGGG 
TTCCTGAGGGCCAAGATGGACCTGGAGGAGCGTAGGATGCGCCAGATTAATGAG 



<210> SEQ ID NO 664 

<211> Length : 159 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 664
>M78076_PEA_1_node_24

CACTTCCAGTCCATTCTGCAGACTCTGGAGGAGCAGGTGTCTGGTGAGCGACAGCGCCTGGTGGAAACCCACGCCAC 
CCGCGTCATCGCCCTTATCAACGACCAGCGCCGGGCTGCCTTGGAGGGCTTCCTGGCAGCCCTGCAGGCAGATCCGC 
CTCAG 

<210> SEQ ID NO 665 

<211> Length : 129 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 665 
>M78076_PEA_1_node_26

GCGGAGCGTGTCCTGTTGGCCCTGCGGCGCTACCTGCGTGCGGAGCAGAAGGAACAGAGGCACACGCTGCGCCACTA 
CCAGCATGTGGCCGCCGTGGATCCCGAGAAGGCACAGCAGATGCGCTTCCAG 

<210> SEQ ID NO 666 

<211> Length : 1,643

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 666 
>M78076_PEA_1_node_29

TCACATCCTTCCAGCTCCCAAATGCGCCGCTATTCCTCAGACGCCCGCGCCTCAGGCTCTTCTCTTGTCCCTTAGAC 
CCTCTTTCTGTCTCTTGGACCCCTTCCTATCCCCTGAACACCGCTTCTCTGCCCCTTCCCAGTCTCTCAGCTCAGCT 
TCCTGACCCTGAAACATGGACCCTCACATGCTGTGTCTTTGACCCCTGCTTCTTGGCCCTTGGATTCCTACTCCCCC 
CGCCGTCGATCCTATGTTCTGTCCCTTGGATTTTCACTGCCTTTCCCAGAATCGTCTTTTTTTTTTTTTTTTTTTTG 
AGACAGGTTCTTGCTCTGTCGCCCAGGCAGGAGAGCAGTGTGCGATCTTGGCTCATTGCAACTTCCACCTCCTGGGT 
TCAAGCAATTCTCCTGCCTCAGCCTCTCGAGTAGCTGGGATTACAGGAGCCTGCCACCACACTGGGCTAATTTTTTT 
TTTTTTTTTTGACAGAGTCTCGCTCTGTTTCCCAGGCTGGAGTGCAGTGACATGATCTGGGCTCACTGCAACCTCCG 
CCTACTGGGTTCAAGCTATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGACTACAGGCGGGTGTCACCACATCTGGC 
TGATTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATACTGGTCAGGCTGGTCTTGAACTCGACCTCAGGTGATCC
ACCCTTGGCCTCCTAAAGTACTCGGATTACAGGTGTGAGCCACCACGCCCGGCCCCAGCTAATTTTTGTATTTTTGG
TAGACACGGGTTTCAGCATGTTGGCCAGGCTGGTCTTGAACTCCTGACCTCAGGTGATCTGCCTGCCTTGGCCTCCC 
AAAGTGCTGGGATTACAGGCGTGAGCCACCATGCCCAGCCAGAAACCCCAATAACTTTTGCACCAATCTAATATTTT 
TAGCAGAGACAGGGTTTTGCCATGTTGCCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCTGCCCACCTCGGCC 
TCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCATGCCCGGCCAGAAACCCCAATAACTTGCACCAATCTAATATT 
TTTAGCAGAGACAGGGTTTTGCCATGTTGCCCAGGCTAGTCTCAAACTCCTGACCTCAGGTGATCTGCCTACCTCGG 
CCTCCCAAAGTGCTGGGATTACAGGCATGAGCCACCGCGCCCGGTCGAGAATCTCCTTCTTGTTCCTTGAACCCTCT 
TCCTGTCCCTCAACCTCCTTTCTCCATAACTTCACTTGTTTTCCCTGGAACCCCTGTTCTGTGCGCTCAAATTTGAA 
TTCCCCTTTCCTGGATGTTTTCTTCCTGTCTATGAAACTCCATTCTGTGCTCTTGAACTCCAAATCTTGCCTTGAAC 
CATGTCATTTCTATATGACCCTCCAATCCTCAATCTCTGTCTCTGGAATCCCCTCAAACCCCACTTTCTGTTCCTTG 
GACTTTATTCTTCAATTTCCTTCTCCTATGGCCCAGTTCCTAACCCTTGTACCACACATCCTGTCCATTGCATGTGC 
CGCTTTTCCTCAGTCGCTATTGAATTCCTCCTTCATACTGCTTCAGTTTCCTCATCTCCAGCCTGCATTGCGCAGTT 
CATCCTTCATGTCCACTCACCCACAG 

<210> SEQ ID NO 667 

<211> Length : 872 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 667 
>M78076_PEA_1_node_32

GTGAGTGTCTATTACCCTGGCTCCCATTACAGATCTCTGAGGGCAGATCTTGACTCCTAAATGTTGGGCCCCCCCAA 
TTTCATTTATTCCTCTATAACAAACAGCCCAGACCTTAGCAGTGAAAATCAACAATGATTTTTCTTTGTTCATGATT 
CTGCCATCCGGTCTGCGCTCAGCAGAGTGGTTCTTTCAGTGGTCTTGCCAGTGGTCAAGCATGCAGCTGTATTTAGC 
TAGCAGATCATCTAGGGGCTGGGAGTCTAGCACAAATGGACCTTTCTCTCTCTCCAAGGAAGCGCAAGGCCTCTCTT 
CTCCGTGGAGCTTCTCCATGTGGTCTCATCAGCAGGGTAGCTAGATTCCCTACATGGTGGTTTATGCTCTCTAAGAC 
ATCACAGTGGAAGTTGCTAGGTCTTAAGGCTTGGGCCCACATTCTATTTGTTAAAGCAAGTTACAAATTCAGTCCAG 
ATTCAAGGGAAGGAACCTATATGCATACCGGAAAGTGTGACCTATTGCAGCCCCCACATCTATTGTGTCTTTCTCCT 
GGATATCTCACACATAACCCTGATTCTCCTAGTATTTAAGAAAGCTATCATCTTGAGGCGCGGTGGCTCACGCCTAT 
AATCCCAGCACTTTAGGAGGCCGAGGCGGGTGGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGT 
GAAACCCCGTCTTTACTAAAAATACAAAAATCAGCCGGGCATGATGTCGCTTGCCTGTAATCCCAGCTACTTAGGAG 
GCTGAGGCAAGAGAATTGCTTGAACCCGGGAGGTGGAGGTTGCAGTGAGCTGAGATCGCATCATTGCACTCCAGCTG 
GGCAACAAGAGTGAGACTCTGTCTC 

<210> SEQ ID NO 668

<211> Length : 259 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 668 
>M78076_PEA_1_node_35

GTGAGTGAGCCCACATATAGATGACCCCAGACATTAGGGAACAGGCCCCAGCCTAATTTGTAATCCCCTAGAGTCTG 
AGGGTGTCTTCACCACCACAGTGACTGGGAGAGGATGAGGAGGAACGTCTAAGGTTGCAGGGGCCTCTGTAGGATCC 
CCAATCCTCCTTCTTAGTCCCTGGAAGGATGTTTCTCCACCTTTCTTTGCTGATACCCTCCTCTCTTCACTGTTCCA 
CTCCCTTGCTTCCTCTGGCTGCCAGCAG 

<210> SEQ ID NO 669 

<211> Length : 463 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 669 
>M78076_PEA_1_node_37

GTGAGTGTCTCACAGTTAACCCCAGCCTCCAAATCCCACTGAATCCCTGAACCCAGAAGGAAACAGGGTCCATCCAT 
TGGGAACCTCAGACCCCCTGGGGTAGAGTTTGATGTACTTTCCAGCCCCCTCCTCTGGACCCTAAAGAATGAGATAG 
GGCCAGGCGCTGGTGACTCACACCCGTAATCCTAGCACTTTCAGAGGCTGAGGCAGGAGGATCCCTTGAGGCCACGA 
GTTCTAGACCAGCCTGGGCAACATAATGAGACCCTGTACCTACAAATAATTTAAAAATTACCTGGGTGTGGTGGGGC 
ATGTCTGTAGTCCCAGCTGCTCAGGAGGCTGACGTAGAAGGATCACTGGAGCCCAGGAAGTTGAGGCTGCAGTGAGC 
TGAGATCATGCCACTGCACTCCAGCCTGGGTGACAGAGTGAGACTCTGTCTAAAGAAAAAAAAAAAGAATGAGATCA
G

<210> SEQ ID NO 670 
<211> Length : 121 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 670 
>M78076_PEA_1_node_46

GTAAGAGGAGGAACAGCCGGGTACCTAGGGGAAGAGACCAGAGGTCAGCGGCCAGGCTGTGATTCCCAAAGCCACAC 
AGGACCCTCAAAGAAGCCCTCTGCCCCATCTCCTCTCCCTGCAG 

<210> SEQ ID NO 671 

<211> Length : 144 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 671 
>M78076_PEA_1_node_47

GCACCAGCTGGGACAGGGGTGTCCCGTGAGGCTGTGTCGGGTCTGCTGATCATGGGAGCGGGCGGAGGCTCCCTCAT 
CGTCCTCTCCATGCTGCTCCTGCGCAGGAAGAAGCCCTACGGGGCTATCAGCCATGGCGTGGTGGAG 

<210> SEQ ID NO 672 

<211> Length : 304 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 672
>M78076_PEA_1_node_54

AACCCCAACTCCCAGCCTAGGGCAGCAGGGAGTCTTGAAGTGATCATTTCACACCCTTTTGTGAGACGGCTGGAAAT 
TCTTATTTCCCCTTTCCAATTCCAAAATTCCATCCCTAAGAATTCCCAGATAGTCCCAGCAGCCTCCCCACGTGGCA 
CCTCCTCACCTTAATTTATTTTTTAAGTTTATTTATGGCTCTTTAAGGTGACCGCCACCTTGGTCCTAGTGTCTATT 
CCCTGGAATTCACCCTCTCATGTTTCCCTACTAACATCCCAATAAAGTCCTCTTCCCTACCAGGCCAGTCTGA 
<210> SEQ ID NO 673

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 673 
>M78076_PEA_1_node_1

CTGCTCGCGGTCTAAGTCGCCGCCCGGGCCAGCCGCCGCTGCCG 

<210> SEQ ID NO 674 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 674 
>M78076_PEA_1_node_2
CTGCTGCTGCCACTATTGCT 

<210> SEQ ID NO 675 

<211> Length : 64 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 675 
>M78076_PEA_1_node_3

GCTGCTTCTGCGCGCGCAGCCCGCCATCGGGAGCCTGGCCGGTGGGAGCCCCGGCGCGGCCGAG

<210> SEQ ID NO 676 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 676 
>M78076_PEA_1_node_6

GCCCCGGGGTCGGCCCAGGTGGCTGGACTATGCGGGCGCCTAACCCTTCACCGGGACCTGCGCACCGGCCGCTGGGA 
ACCAG 

<210> SEQ ID NO 677 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 677 
>M78076_PEA_1_node_7

ACCCACAGCGCTCTCGACGCTGTCTCCGGGACCCGCAGCGCGTGCTGGAGTACTGCAGACAG 

<210> SEQ ID NO 678 

<211> Length : 113 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 678 
>M78076_PEA_1_node_12

CTGGTGAATTTGTGAGTGAGGCCCTGCTGGTGCCTGAAGGCTGCCGGTTCTTGCACCAGGAGCGCATGGACCAATGT 
GAGAGTTCAACCCGGAGGCATCAGGAGGCACAGGAG 

<210> SEQ ID NO 679 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 679 
>M78076_PEA_1_node_22

GTGATGCGTGAATGGGCCATGGCAGACAACCAGTCCAAGAACCTGCCTAAAGCCGACAGACAGGCCCTGAATGAG 

<210> SEQ ID NO 680 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 680 
>M78076_PEA_1_node_27
GTGC 

<210> SEQ ID NO 681 

<211> Length : 72 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 681
>M78076_PEA_1_node_30

GTGCATACCCACCTTCAAGTGATTGAGGAGAGGGTGAATCAGAGCCTGGGCCTGCTTGACCAGAACCCCCAC 

<210> SEQ ID NO 682 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 682 

>M78076_PEA_1_node_31

CTGGCTCAGGAGCTGCGGCCCCAAATCC 

<210> SEQ ID NO 683 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 683 
>M78076_PEA_1_node_34

AGGAACTCCTCCACTCTGAACACCTGGGTCCCAGTGAATTGGAAGCCCCTGCCCCTGGGGGCAGCAGCGAGGACAAG 
GGTGGGCTGCAGCCTCCAGATTCCAAGGATG 

<210> SEQ ID NO 684 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 684
>M78076_PEA_1_node_36
ACACCCCCATGACCCTTCCAAAAG 

<210> SEQ ID NO 685 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 685 
>M78076_PEA_1_node_41
ACTTGGGGGTAG 

<210> SEQ ID NO 686 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 686 
>M78076_PEA_1_node_42
GGTCCACAG 

<210> SEQ ID NO 687 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 687 
>M78076_PEA_1_node_43

AACAAGATGCTGCATCCCCTGAGAAAGAGAAGATGAACCCGCTGGAACAGTATGAGCGAAAG

<210> SEQ ID NO 688 

<211> Length : 63 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 688 
>M78076_PEA_1_node_45

GTGAATGCGTCTGTTCCAAGGGGTTTCCCTTTCCACTCATCGGAGATTCAGAGGGATGAGCTG 

<210> SEQ ID NO 689 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 689 
>M78076_PEA_1_node_49

GTGGACCCCATGCTGACCCTGGAGGAGCAGCAGCTC 

<210> SEQ ID NO 690 

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 690 
>M78076_PEA_1_node_50

CGCGAACTGCAGCGGCACGGCTATGAGAACCCCACTTAC 
<210> SEQ ID NO 691

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 691 
>M78076_PEA_1_node_51

CGCTTCCTGGAGGAACGACCCTGACCCGGCCCCCTTCACCCCTTCAGCCGAGCCCAGAC 

<210> SEQ ID NO 692 

<211> Length : 17 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 692 
>M78076_PEA_1_node_52
CTCCCCTCTTCCTGGAG 

<210> SEQ ID NO 693 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 694 
>M78076_PEA_1_node_53
CCCCAG 
<210> SEQ ID NO 695

<211> Length : 307 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 695 
>T99080_PEA_4_node_1

GCGGTCAGCCCAAGGTCACTTGACCCAGTCAGTGTCCGGCCAACTCTGCAGCTCGGTCCAGCCCTGCCCTTGGGGAG 
CCGGGGAGGGGCGGGAGAGGCTTTTCTGGAGCTCCTTCAAAGAAGAACTTGTACTTTTCTGAGAACGACGCTCCCAG 
ACCTTGGGGTGTGCCCTTGTCTGGCAAAGGGCGGAGGCCCTGGCTGTGCCTCCGCGTGCTTCCGCCGCAGGATGCCG 
GCGTCCGCCCGCCTGGCGGGAGCGGGGCTGCTGCTGGCCTTTCTCCGCGCGCTCGGCTGCGCTGGGCGGGCCCCAG 

<210> SEQ ID NO 696 

<211> Length : 447 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 696 
>T99080_PEA_4_node_6

GTATGTGGCCTGCAGGCTTTGGGGTGGTAACCTAGGTTGTGGGCATAGGAACAAGGGACGTGTTCTTGACAAATGTG 
AACTAGAAGCCGCTGGCTATTTGGTAGCTGCATGGCAAAGAGGTAGTTTGTAGAGCAATGAGTATGAAAATGCTGTG 
CAATCGGTAAAACATGGTGTAAATGGAAGATCACCCTGCTGTTATTATTAGTCAGTGGTGCCTGAGCATCTAGATAC 
ACATATTAATGAGCTTCTCTCTTCCAAGGGAAATAGAGGGCTTCCCCAGTGCTCGCCGTTGTGGCATCATCAAACCA 
AGTCAGGTTTCTTATAGAAAGGCTAACATTGATTGAAGAGCTGGCATTTAGATGACCTGATGTCATGTATATAACAT 
ATATAAATGCTTCCTCTGCAGCTGCTGCACTTTCTTCAGACCTCTTTCTCCAAGCTTCCCTC 

<210> SEQ ID NO 697 
<211> Length : 523 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 697 
>T99080_PEA_4_node_11

GTATTAATGAATTTAAAAGGATTTAAATCAAGGAATGTTCTCCAACTACAGTGGAACTAAACCACATTAAAAAAATA 
AAAAGGATAACTGGAAAATCCCAAAATATTTGGAAACCATATAGCACACTTACTTCTAAAATTGTGGTAGAATACAT 
ATAACATAGAAATTATTGTTCTAACCATTTTTAAATGTACAATTCAGTGGTCTTAAGCACATTCACATTGTTCTGTT 
TATCTACAGAACGCTTTTCATCTTGCAAAACTGAAACTCTGTATTCATTAAACACTAACTCCCCATTTTCTCCTTCC 
CCCATGCCCCTGACAATCATAAATCTACATTCTATTAATTCAACTGCTCTAGTTACCTCATATAAGTGGAATTTTAC 
AGTATTTGTCCTTTTGTGGCTGGCTTATTTCACTTAGCATAATATCCTCAGGGTTCATCTGTGTTATATCATGAAAG 
TAAAAACAATTTCCTTTCTTTGTAAGACGGAATAATATCCTGTTGTATGTGTATACTTTCA 

<210> SEQ ID NO 698 

<211> Length : 1,288 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 698 
>T99080_PEA_4_node_19

CTGGGTTAATTTGAAGCAGAAGAGGGCAGTTGTTATACTGCCCTGTCAGTTGGATGCGGAGTCTTACTCAAAATTCA 
TTCTCAGCATTCTTCTTTTATGGTATCTTCTTTGGCACTTAGCAGCGCATCAGGTAGGCATCTTCTATTTTTCTTCA 
TTCCTTAATTTCCTTTGTATCCCTCAAATGGTTATTTATTTGGCTGGAGTCTGTTTTGTTCATTAAGCAAACATGTC 
TTTGCTTTGAACATGTCTTTGATTTGATGGATACTTAAATTCCTCATCAAACATTTTGTTGCTATGCATAACGTTTT 
CTTTGGCCAACTCCAGCAATTTCCCACATTTTGACATGCAATCATGTTAACTCCCATTTTCTTTTGTAATCCAACAT 
CTTCTATTTAGATAATTACTTTAACAATCAATGACTTAATATTCTAATCATAAATTTATACAAAAATAAAATTACCT 
CCAAAACATTGCTACCTTTCCTAAACATTCAGTCTTGCCACAGTTTAATAAAAGGAAAGAACATTAAAAAGGATAAG 
ACACTGTAATGATTAGATGCTTTTTATAAGCCTAAAGGCATTGTGATTATTTAGACAGAAGAGAAGAAAGTGAAGTG 
AAAACCTGATAGTTATGTAGTCTCATGGTTTGCTGTTGAGAGGCTGAACACCAGCTGCTTTCCTTTTCTAGGAAGAT 
AATAAAGTGGGCTTTGGCTACAACATAAAGATGTTGGGTTAGACAGTTTCACTACAGTAAGAACAACGGGATGAGTT 
GCCCAGGAAATTGTGAAATACTTTCTAATGATCTTTAAAGATATAATGAACACTAATTCATCTGGATTTGTTTAGGT 
GTGGTCCTGGTTAAAGGCAAAGGGAAGGATCAGATAACTTCATGTTTTTTCCATTTAACATACCCAATAGATTCTTG 
ATTAGGGGAAGGGAAAATGAGCAAGATACAGTCCAGTATTCTAAAAACAATCAGCCTTAGGGGATCATTTCAAAAGC 
ATCTGTTTTGGACTTAAGTCTTTGATACTTAACCAAATTGACTACACAGTGAAAAATTCTAGTGCCTGGGTTTTATA 
GGGTAGAAGAAAGACATGCAGTCAAGTGGCCAATACTTCATGTGAAGATAAGCAATGAGATCCTTCTTGCTGTCTTT
CTTTTGACTGTTCTGGGCAATATCAAATTAGTTTCAGTGGCTTGATTCTAGGCCAAGATTCTGGCAACAGATTGTAG 
TCTTACCTTGTTTTCTTCAATCTCACTGGATCTCTCTCTCTTTTTACCCCCCTTAG 

<210> SEQ ID NO 699 

<211> Length : 439 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 699 
>T99080_PEA_4_node_20

GCTGAGGGTAAAAAGCTGGGATTGGTAGGCTGGGTCCAGAACACTGACCGGGGCACAGTGCAAGGACAATTGCAAGG 
TCCCATCTCCAAGGTGCGTCATATGCAGGAATGGCTTGAAACAAGAGGAAGTCCTAAATCACACATCGACAAAGCAA 
ACTTCAACAATGAAAAAGTCATCTTGAAGTTGGATTACTCAGACTTCCAAATTGTAAAATAATGGCCTGAATTTAAG 
TTTTCTAAGATAAACTCAGTGGTTTGGTTTTTATTATTAATAGAGATAGAACTATTGTGTGTTAATATTAGCATTAG 
TCAATAAGTTATTTTAATGTCAGATTTTTGAATGTTATTATATATTACCTGTATGATGGAAGGATTACCACTGTACA
CAAATCTAATCAATAAAAACGTTAGAACCTTCTGCTTAGAGTACATTTAAAAAA

<210> SEQ ID NO 700 

<211> Length : 88 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 700 
>T99080_PEA_4_node_3

TCCGGGCGCGGAGGTTTGCGCGCCTTGGTGAGCCGTTGGCGTGGTGGTCCCGGAGTGATCCTGGCAGCCGGTGGGGA 
AGACAAGGAGG 

<210> SEQ ID NO 701 
<211> Length : 92 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 701 
>T99080_PEA_4_node_5

GTTTGAGCATGGCAGAAGGAAACACCCTGATATCAGTGGATTATGAAATTTTTGGGAAGGTGCAAGGGGTGTTTTTC 
CGTAAGCATACTCAG 

<210> SEQ ID NO 702 

<211> Length : 79 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 702 
>T99080_PEA_4_node_8

GAAATGACTGTTGAAAACAGAATTGCTGAAACTCACAGCAAGAGCTGTGTTCCAGTTAGCTTTGCTACCAGTTATGC 
AG 

<210> SEQ ID NO 703 

<211> Length : 77 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 703 
>T99080_PEA_4_node_13

TTTTGAGACAGAGTCTTGCTCTGTTGCCTAGGCTAAAGTGCAGTAGTGCGATCTCGGCTCACTGCAACCTCCACTTC 

<210> SEQ ID NO 704 
<211> Length : 114 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 704 
>T99080_PEA_4_node_15

CTTTGTTTTTGGGAGGCTGAGGAAGGAGGATCATTTGAGCCTGGGAGGTTAAGGCTGCAATAAGCTGTGACTGTGCC 
ACCATCCTTCAGAAAAAAAAAAGAAAAAGGAAAAGAG

<210> SEQ ID NO 705 

<211> Length : 49 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 705 
>T99080_PEA_4_node_18

CTATTTACCATATAAGATAACTCTTAAGAACTGGAGATAGTCAGCTCCC 

<210> SEQ ID NO 706 

<211> Length : 287 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 706 
>T08446_PEA_1_node_2

GCTACGGAAGGGTTTTCAGCAGGGAGAGATGGGACCAGCATGCTCTCCATCCTGTTGCCCCGCCTCACGTCTGGCCC 
CTGCTTCTCCAGTCCCCCACCCCAGACCACACACGAAGAAGCAGTCCTGTCCTCAGCCCAGCCCTCACCTCCCCCGA 
CCTGCCATCCTGCTTCATGCTCAGGGCGGTGTGTGGAGCGCCCGGGGCTCTGGACCCGCGCTGCCAGATAACAATGC 
TCTCGTTGTCTCTTTGCTCCCATCTCTGGGGGCCTCTGATTCTTTCTGCTCTACAG 
<210> SEQ ID NO 707

<211> Length : 138 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 707 
>T08446_PEA_1_node_9

GGCCGTTCCTGGCCGGTTCTCCGGAGTTACGATGACTTTCGTTCCCTGGATGCCCACCTCCACCGGTGCATATTTGA 
CCGGAGGTTCTCCTGCCTTCCGGAGCTTCCCCCGCCCCCCGAGGGTGCCAGGGCTGCCCAG 

<210> SEQ ID NO 708 

<211> Length : 140 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 708 
>T08446_PEA_1_node_15

GCGTGTGCCACCACGCCCAGCACAGAGAGATTTTAGATGGCAACCGTGTGGCATCTGCTGTGGAGGATGAAGGTGCA 
GAGGTGGATGGGGAAGCCTTCAGGTGGGGAAGCCTTTGGGTGGGAGAGTCCTGGGACATGTGA 

<210> SEQ ID NO 709 

<211> Length : 123 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 709 
>T08446_PEA_1_node_17

CTGGACAATCACGGCCGGCGACTGCTCCTCAGTGAGGAGGCGTCACTCAATATCCCTGCAGTGGCGGCCGCCCATGT
GATCAAACGGTATACAGCCCAGGCGCCAGATGAGCTGTCCTTTGAG 

<210> SEQ ID NO 710 

<211> Length : 153

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 710 
>T08446_PEA_1_node_25

CTGTGCCACGGCCTCGTGGGAAGCTGGCCGGCCTGCTCCGCACCTTCATGCGCTCCCGCCCTTCTCGGCAGCGGCTG 
CGGCAGCGGGGAATCCTGCGACAGAGGGTGTTTGGCTGCGATCTTGGCGAGCACCTCAGCAACTCAGGCCAGGATG 

<210> SEQ ID NO 711 

<211> Length : 145 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 711 
>T08446_PEA_1_node_29

GCACGAGTTTGACAGTGAGAGGATCCCGGAGCTGTCTGGCCCTGCATTCCTGCAGGACATCCACAGCGTGTCCTCCC 
TCTGCAAGCTCTACTTCCGAGAGCTTCCGAACCCTCTGCTCACCTACCAGCTCTATGGGAAGTTCAGT 

<210> SEQ ID NO 712 

<211> Length : 146 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 712
>T08446_PEA_1_node_38

GTCCATGGAGCTGGAGTCAGTGGGAATGGGTGGCGCGGCGGCGTTCCGGGAAGTTCGGGTGCAGTCGGTGGTGGTGG 
AGTTTCTGCTCACCCATGTGGACGTCCTGTTCAGCGACACCTTCACCTCCGCCGGCCTCGACCCTGCAG 

<210> SEQ ID NO 713 

<211> Length : 154 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 713 
>T08446_PEA_1_node_43

GCCGCTGCCTGCTCCCCAGGCCCAAGTCCCTTGCGGGCAGCTGCCCCTCCACCCGCCTGCTGACGCTGGAGGAAGCC 
CAGGCACGCACCCAGGGCCGGCTGGGGACGCCCACGGAGCCCACAACTCCCAAGGCCCCGGCCTCACCTGCGGAAAG 

<210> SEQ ID NO 714 

<211> Length : 348 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 714 
>T08446_PEA_1_node_51

GCCTCCAGAGGCTGCACAGGCTGCGGCGACCCCACTCCAGCAGCGACGCTTTCCCTGTGGGCCCAGCACCTGCTGGC 
TCCTGCGAGAGCCTGTCCTCGTCCTCCTCCTCCGAGTCCTCCTCCTCTGAGTCCTCCTCTTCCTCCTCTGAGTCCTC 
AGCAGCTGGGCTGGGGGCACTCTCTGGGTCTCCCTCACACCGTACCTCAGCCTGGCTAGATGATGGTGATGAGCTGG 
ACTTCAGCCCACCCCGCTGCCTGGAGGGACTCCGGGGGCTGGACTTTGATCCCTTAACCTTCCGCTGCAGCAGCCCC 
ACCCCAGGGGATCCCGCACCTCCCGCCAGCCCAGCACCCC 

<210> SEQ ID NO 715 
<211> Length : 123 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 715 
>T08446_PEA_1_node_52

CCGCCCCTGCCTCTGCCTTCCCACCCAGGGTGACCCCCCAGGCCATCTCGCCCCGGGGGCCCACCAGCCCCGCCTCG 
CCTGCTGCCCTAGACATCTCAGAGCCCCTGGCTGTATCAGTGCCAC 

<210> SEQ ID NO 716 

<211> Length : 177 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 716 
>T08446_PEA_1_node_55

AACTGCTGGGGGCTGGGGGAGCACCTGCCTCAGCCACCCCAACACCAGCTCTCAGCCCCGGCCGGAGCCTGCGCCCC 
CATCTCATACCCCTGCTGCTGCGAGGAGCCGAGGCCCCGCTGACTGACGCCTGCCAGCAGGAGATGTGCAGCAAGCT 
CCGGGGAGCCCAGGGCCCACTCG 

<210> SEQ ID NO 717 

<211> Length : 392 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 717 
>T08446_PEA_1_node_57

GTCCTGATATGGAGTCACCACTGCCACCCCCTCCCCTGTCTCTCCTGCGCCCTGGGGGTGCCCCACCCCCGCCCCCT 
AAGAACCCAGCACGCCTCATGGCCCTGGCCCTGGCTGAGCGGGCTCAGCAGGTGGCCGAGCAACAGAGCCAGCAGGA 
GTGTGGGGGCACCCCACCTGCTTCCCAATCCCCCTTCCACCGCTCGCTGTCTCTGGAGGTGGGCGGGGAGCCCCTGG 
GGACCTCAGGGAGTGGGCCACCTCCCAACTCCCTAGCACACCCGGGTGCCTGGGTCCCGGGACCCCCACCCTACTTA 
CCAAGGCAACAAAGTGATGGGAGCCTGCTGAGGAGCCAGCGGCCCATGGGGACCTCAAGGAGGGGACTCCGAGGCCC
TGCCCAG 

<210> SEQ ID NO 718 

<211> Length : 311 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 718 
>T08446_PEA_1_node_59

GTTCCTACCCCCGGCTTCTTCTCCCCAGCCCCCAGGGAGTGCCTGCCACCCTTCCTCGGGGTCCCCAAGCCAGGCTT 
GTACCCCCTGGGCCCCCCATCCTTCCAGCCCAGTTCCCCAGCCCCAGTCTGGAGGAGCTCTCTGGGCCCCCCTGCAC 
CACTCGACAGGGGAGAGAACCTGTACTATGAGATCGGGGCAAGTGAGGGGTCCCCCTATTCTGGCCCCACCCGCTCC 
TGGAGTCCCTTTCGCTCCATGCCCCCCGACAGGCTCAATGCCTCCTACGGCATGCTTGGCCAATCACCCCCACTCCA
CAG

<210> SEQ ID NO 719 

<211> Length : 206 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 719 
>T08446_PEA_1_node_62

CTCTACGTCAACCTAGCTCTAGGGCCCAGGGGTCCCTCACCTGCCTCTTCCTCCTCCTCTTCCCCTCCTGCCCACCC 
CCGAAGCCGTTCAGATCCCGGTCCCCCAGTCCCCCGCCTTCCCCAGAAACAACGGGCACCCTGGGGACCCCGTACCC 
CTCATAGGGTGCCGGGTCCCTGGGGCCCTCCTGAGCCTCTCCTGCTCTACAG 

<210> SEQ ID NO 720 
<211> Length : 426 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 720 
>T08446_PEA_1_node_63

GGCAGCCCCGCCAGCCTACGGAAGGGGGGGCGAGCTCCACCGAGGGTCCTTGTACAGAAATGGAGGGCAAAGAGGGG 
AGGGGGCTGGTCCCCCACCCCCTTACCCCACTCCCAGCTGGTCCCTCCACTCTGAGGGCCAGACCCGAAGCTACTGC 
TGAGCACCAGCTGGGAGGGGCCGTCCTTCCTTCCCTTCACCCTCACTGGATCTTGGCCCAACCAAATCCCTTGTTTT 
GTATTTTCTTGAACCCCGACCACTACCCCAGGTTTCTAACTTTGTAACTTGCTTCTGATGTGGGTCCCTAACCTATA 
ATCTCAGCTTCCCTACCCTGGACTGAAGGGTCTGCCCATCCCCCCACCACCCTCCATCCTGGGGGCCCTCGCACAAA 
TCTGGGGTGGGAGGGGCTAGGCTGACCCCATCCTCCTCTCC 



<210> SEQ ID NO 721 

<211> Length : 98 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 721 
>T08446_PEA_1_node_3

GCACGCAGCACTGACAGCCTGGATGGCCCAGGGGAGGGCTCGGTGCAGCCTCTACCCACTGCTGGGGGGCCCAGTGT 
GAAGGGGAAGCCTGGGAAGAG 

<210> SEQ ID NO 722 

<211> Length : 85 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 722 
>T08446_PEA_1_node_5

GCTCTCAGCTCCTCGAGGCCCCTTCCCGCGGCTGGCTGACTGCGCCCATTTCCACTACGAGAACGTTGACTTTGGCC 
ACATTCAG 



WO 2006/131783 



PCT/IB2005/004037 



<210> SEQ ID NO 723 

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 723 
>T08446_PEA_1_node_7

CTCCTGCTGTCTCCAGACCGTGAAGGGCCCAGCCTCTCTGGAGAGAATGAGCTGGTGTTCGGGGTGCAGGTGACCTG 
TCAG 

<210> SEQ ID NO 724 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 724 
>T08446_PEA_1_node_12

ATGCTGGTGCCACTGCTGCTGCAGTACCTGGAGACACTGTCAGGACTGGTGGACAGTAACCTCAACTGCGGGCCTGT 
GCTCACCTGGATGGAG 

<210> SEQ ID NO 725 

<211> Length : 46 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 725
>T08446_PEA_1_node_13

GTGGGCCTGGGCAGGGGGCTTGGAGATTCCGAGTGGGTGAGGGGGT 

<210> SEQ ID NO 726 

<211> Length : 78 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 726 
>T08446_PEA_1_node_19

GTGGGAGACATTGTCTCGGTGATCGACATGCCACCCACAGAGGATCGGAGCTGGTGGCGGGGCAAGCGAGGCTTCCA 
G 

<210> SEQ ID NO 727 

<211> Length : 67 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 727 
>T08446_PEA_1_node_21

GTCGGGTTCTTCCCCAGTGAGTGTGTGGAACTCTTCACAGAGCGGCCAGGTCCGGGCCTGAAGGCGG 

<210> SEQ ID NO 728 

<211> Length : 60 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 728
>T08446_PEA_1_node_23

ATGCCGATGGCCCCCCATGTGGCATCCCGGCTCCCCAGGGTATCTCGTCTCTGACCTCAG 

<210> SEQ ID NO 729 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 729 
>T08446_PEA_1_node_27

TGCCCCAGGTGCTGCGCTGCTGCTCCGAGTTCATTGAGGCCCACGGGGTGGTGGATGGGATCTACCGGCTCTCAGGC 
GTGTCTTCCAACATCCAGAGGCTTCG 

<210> SEQ ID NO 730 

<211> Length : 83 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 730 
>T08446_PEA_1_node_32

GAGGCCATGTCAGTGCCTGGGGAGGAGGAGCGTCTGGTGCGGGTGCACGATGTCATCCAGCAGCTGCCCCCACCACA 
TTACAG 

<210> SEQ ID NO 731 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 731
>T08446_PEA_1_node_34

GACCCTGGAGTACCTGCTGAGGCACCTGGCCCGCATGGCGAGACACAGTGCCAACACCAGCATGCATGCCCGCAACC 
TGGCCATTGTCTGGGCACCCAACCTGCTACG 

<210> SEQ ID NO 732 

<211> Length : 89 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 732 
>T08446_PEA_1_node_45

GAGGAAAGGGGAGAGAGGGGAGAAGCAGCGGAAGCCAGGGGGCAGCAGCTGGAAGACGTTCTTTGCACTGGGCCGGG 
GCCCCAGTGTCC 

<210> SEQ ID NO 733 

<211> Length : 57 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 733 
>T08446_PEA_1_node_46

CTCGAAAGAAGCCCCTGCCCTGGCTGGGGGGCACCCGTGCCCCACCGCAGCCTTCAG 

<210> SEQ ID NO 734 
<211> Length : 75 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 734 
>T08446_PEA_1_node_48

GCAGCAGACCCGACACCGTCACACTGAGATCTGCCAAGAGCGAGGAGTCTCTGTCATCGCAGGCCAGCGGGGCTG 



<210> SEQ ID NO 735 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 735 
>T08446_PEA_1_node_54
CCGCTGTCCTAG 



<210> SEQ ID NO 736 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 736 
>T08446_PEA_1_node_58

GTCAGTGCCCAGCTCAGGGCAGGTGGCGGGGGCAGGGATGCGCCAGAGGCAGCAGCCCAGTCCCCATGTTCTGTCCC 
CTCACAG 



<210> SEQ ID NO 737 
<211> Length : 50 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 737 
>T08446_PEA_1_node_60

GTCCCCCGACTTCCTGCTCAGCTACCCGCCAGCCCCCTCCTGCTTTCCCC 

<210> SEQ ID NO 738 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 738 
>T08446_PEA_1_node_61

CTGACCACCTTGGCTACTCAGCCCCCCAGCACCCTGCTCGGCGCCCTACACCGCCTGAGCCC 

<210> SEQ ID NO 739 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 739 
>T08446_PEA_1_node_64
CTCCAG 

<210> SEQ ID NO 740 
<211> Length : 52 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 740 
>T08446_PEA_1_node_65

GAGCCCCCAGCATGTCCTGACCTGTGCACGGGGATGGGGGGACAACTCCTAC 

<210> SEQ ID NO 741 

<211> Length : 67 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 741 
>T08446_PEA_1_node_66

CCTTCTTTCCCCACATGCCCCACTAAACCATCTGACAACATTAATGAATAAAATGGTGAAAATGTGA 

<210> SEQ ID NO 742 

<211> Length : 424 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 742 
>HUMCA1XIA_node_0

ACACAGTACTCTCAGCTTGTTGGTGGAAGCCCCTCATCTGCCTTCATTCTGAAGGCAGGGCCCGGCAGAGGAAGGAT 
CAGAGGGTCGCGGCCGGAGGGTCCCGGCCGGTGGGGCCAACTCAGAGGGAGAGGAAAGGGCTAGAGACACGAAGAAC 
GCAAACCATCAAATTTAGAAGAAAAAGCCCTTTGACTTTTTCCCCCTCTCCCTCCCCAATGGCTGTGTAGCAAACAT 
CCCTGGCGATACCTTGGAAAGGACGAAGTTGGTCTGCAGTCGCAATTTCGTGGGTTGAGTTCACAGTTGTGAGTGCG 
GGGCTCGGAGATGGAGCCGTGGTCCTCTAGGTGGAAAACGAAACGGTGGCTCTGGGATTTCACCGTAACAACCCTCG
CATTGACCTTCCTCTTCCAAGCTAGAGAGGTCAGAGGAG
<210> SEQ ID NO 743

<211> Length : 168 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 743 
>HUMCA1XIA_node_2

CTGCTCCAGTTGATGTACTAAAAGCACTAGATTTTCACAATTCTCCAGAGGGAATATCAAAAACAACGGGATTTTGC 
ACAAACAGAAAGAATTCTAAAGGCTCAGATACTGCTTACAGAGTTTCAAAGCAAGCACAACTCAGTGCCCCAACAAA
ACAGTTATTTCCAG



<210> SEQ ID NO 744 

<211> Length : 214 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 744 
>HUMCA1XIA_node_4

GTGGAACTTTCCCAGAAGACTTTTCAATACTATTTACAGTAAAACCAAAAAAAGGAATTCAGTCTTTCCTTTTATCT 
ATATATAATGAGCATGGTATTCAGCAAATTGGTGTTGAGGTTGGGAGATCACCTGTTTTTCTGTTTGAAGACCACAC 
TGGAAAACCTGCCCCAGAAGACTATCCCCTCTTCAGAACTGTTAACATCGCTGACGGGAA 



<210> SEQ ID NO 745 

<211> Length : 163

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 745 
>HUMCA1XIA_node_6

GTGGCATCGGGTAGCAATCAGCGTGGAGAAGAAAACTGTGACAATGATTGTTGATTGTAAGAAGAAAACCACGAAAC
CACTTGATAGAAGTGAGAGAGCAATTGTTGATACCAATGGAATCACGGTTTTTGGAACAAGGATTTTGGATGAAGAA
GTTTTTGAG

<210> SEQ ID NO 746 

<211> Length : 129 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 746 
>HUMCA1XIA_node_8

GGGGACATTCAGCAGTTTTTGATCACAGGTGATCCCAAGGCAGCATATGACTACTGTGAGCATTATAGTCCAGACTG 
TGACTCTTCAGCACCCAAGGCTGCTCAAGCTCAGGAACCTCAGATAGATGAG 

<210> SEQ ID NO 747 

<211> Length : 215 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 747 
>HUMCA1XIA_node_9

GTGAGGAGCACAAGACCAGAGAAGGTCTTCGTATTTCAGTGATATGGACATAGCAGTGCAGTTTTTGAACTTATACT 
TATTTATTCCATTTTAATTAAGCTATGCTTTGTATTTTAATTGTGTTGTAATATTTCCAGGAAAAAGTGACTTGAAT 
ATATTTGGTACTTGTTTTCTTGCTGTTTAAGCATTTGATTACATAAATTTATTAGCAATAA 

<210> SEQ ID NO 748 
<211> Length : 214 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 748 
>HUMCA1XIA_node_18

CCAAATCCAGTTGAAGAAATATTTACTGAAGAATATCTAACGGGAGAGGATTATGATTCCCAGAGGAAAAATTCTGA 
GGATACACTATATGAAAACAAAGAAATAGACGGCAGGGATTCTGATCTTCTGGTAGATGGAGATTTAGGCGAATATG 
ATTTTTATGAATATAAAGAATATGAAGATAAACCAACAAGCCCCCCTAATGAAGAATTTG 

<210> SEQ ID NO 749 

<211> Length : 430 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 749 
>HUMCA1XIA_node_54

GTCTCCTTTTCTTTCTCATTATTTTACAAAAAAGTAATTAAGTTTGCTTGTGACAAAAGATTTGTAGGAAGACATGA 
TGAAAGGAAAGTTGTGAAGTTGTCATTGCCATTATATCTCATATATGAATAGCAAATGATGTTCTTGTTAATAACTC 
AATGTCTGGATCAAAACAAGAATAAATCTATAATTAAACACATGTCTTTTTCCCTGATCTCCACTGTGAGATTCCTT 
GAGTAATATTTTGTCCCCTGTAGTCATAGCACATTTCTATCTGGCTCTTTCCAACACCTTTTTTCTTTCATTATTTT 
TGTTGATTTCTACAAACATATTAATTAAAAAAAACTAATAGCTTTATCGAAGTGTAATTTAAATACTATAAATATTC 
CCTTGTTTTAAGTGTACAAGTGAATTATTTTTACTAAATTTACAG 

<210> SEQ ID NO 750 

<211> Length : 639 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 750 
>HUMCA1XIA_node_55

ATGTGCTGCAATCTAAGTTTCGGAATACTTATACCACTCCAGAAATAATCCTCGTGTTAATATTCAGTCCATGTTTC 
CACCCCCTCAGACCTAGCTAACAATTAATTTGCTTTCTGTCTCTGTAGATTTGAGTTTTCTGAATATTTCATGTGAA 
TGAAACTTATGAATAAATTTATGATTTAATCATATGAATGTGTATGTGACCTTTATGTCTGACTTCTTTCATTTTAG 
GTTCATCCATGCTGTAGCATACATATATTAGTACTTTGCTCCTTATTTTGTTCTCATAGTAXTCCATTGATTGGGTA
TACCAGGTTCTGTTTACTTTTACTTGGCAGTTGATAGAATAGGTGTAGTTTATACTTTTTCGCTATTCTCCATACCG 
GTGCTGTAGTGAATAATTGCATACAAGTCTTTGTATAGATGTGTTTTCATTCTTTTTGGTATATACTTAGAAGCAGA 
ATTCTTGTGTTATGGTAAACTTATATTCAATATTTTGTGAATTCCACTCTTTTCCATATCGATTGTACCATTTTCCC 
TTCCAAGTAACCATGTATGAGGATAGTCATTTCTGCACATTCTCACTAATGCTTGTTATTGTCTGTCTTCTTGATTA
CGATCATTCTCGTTGGTGTGAAA

<210> SEQ ID NO 751 

<211> Length : 129 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 751 
>HUMCA1XIA_node_92

GTAAGTATGATGATAATAAATAGCCAGACAATCATGGTTGTGAATTACAGCTCTTCTTTCATTACTCTCATGCTGTG 
ATTCCACAGTGTGTGGGAGAGAAAATAAACACATGTCAATCAAATCAGTCAA 

<210> SEQ ID NO 752 

<211> Length : 117 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 752 
>HUMCA1XIA_node_11

TATGCACCAGAGGATATAATCGAATATGACTATGAGTATGGGGAAGCAGAGTATAAAGAGGCTGAAAGTGTAACAGA 
GGGACCCACTGTAACTGAGGAGACAATAGCACAGACGGAG 

<210> SEQ ID NO 753 
<211> Length : 93 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 753 
>HUMCA1XIA_node_15

GCAAACATCGTTGATGATTTTCAAGAATACAACTATGGAACAATGGAAAGTTACCAGACAGAAGCTCCTAGGCATGT 
TTCTGGGACAAATGAG 

<210> SEQ ID NO 754 

<211> Length : 41 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 754 
>HUMCA1XIA_node_19

GTCCAGGTGTACCAGCAGAAACTGATATTACAGAAACAAGC 

<210> SEQ ID NO 755 

<211> Length : 63 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 755 
>HUMCA1XIA_node_21

ATAAATGGCCATGGTGCATATGGAGAGAAAGGACAGAAAGGAGAACCAGCAGTGGTTGAGCCT 

<210> SEQ ID NO 756 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 756
>HUMCA1XIA_node_23

GGTATGCTTGTCGAAGGACCACCAGGACCAGCAGGACCTGCA 

<210> SEQ ID NO 757 

<211> Length : 63 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 757 
>HUMCA1XIA_node_25

GGTATTATGGGTCCTCCAGGTCTACAAGGCCCCACTGGACCCCCTGGTGACCCTGGCGATAGG 

<210> SEQ ID NO 758 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 758 
>HUMCA1XIA_node_27

GGCCCCCCAGGACGTCCTGGCTTACCAGGGGCTGATGGTCTACCTGGTCCTCCTGGTACTATGTTGATGTTACCG 

<210> SEQ ID NO 759 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 759
>HUMCA1XIA_node_29

TTCCGTTATGGTGGTGATGGTTCCAAAGGACCAACCATCTCTGCTCAGGAAGCTCAGGCTCAAGCTATTCTTCAGCA 
GGCTCGG 

<210> SEQ ID NO 760 

<211> Length : 57 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 760 
>HUMCA1XIA_node_31

ATTGCTCTGAGAGGCCCACCTGGCCCAATGGGTCTAACTGGAAGACCAGGTCCTGTG 

<210> SEQ ID NO 761 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 761 
>HUMCA1XIA_node_33

GGGGGGCCTGGTTCATCTGGGGCCAAAGGTGAGAGTGGTGATCCAGGTCCTCAG 

<210> SEQ ID NO 762 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 762
>HUMCA1XIA_node_35

GGCCCTCGAGGCGTCCAGGGTCCCCCTGGTCCAACGGGAAAACCTGGAAAAAGG 

<210> SEQ ID NO 763 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 763 
>HUMCA1XIA_node_37

GGTCGTCCAGGTGCAGATGGAGGAAGAGGAATGCCAGGAGAACCTGGGGCAAAG 

<210> SEQ ID NO 764 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 764 
>HUMCA1XIA_node_39

GGAGATCGAGGGTTTGATGGACTTCCGGGTCTGCCAGGTGACAAAGGTCACAGG 

<210> SEQ ID NO 765 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 765 
>HUMCA1XIA_node_41

GGTGAACGAGGTCCTCAAGGTCCTCCAGGTCCTCCTGGTGATGATGGAATGAGG 

<210> SEQ ID NO 766 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 766
>HUMCA1XIA_node_43

GGAGAAGATGGAGAAATTGGACCAAGAGGTCTTCCAGGTGAAGCT 

<210> SEQ ID NO 767 

<211> Length : 54

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 767 
>HUMCA1XIA_node_45

GGCCCACGAGGTTTGCTGGGTCCAAGGGGAACTCCAGGAGCTCCAGGGCAGCCT 

<210> SEQ ID NO 768 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 768 
>HUMCA1XIA_node_47

GGTATGGCAGGTGTAGATGGCCCCCCAGGACCAAAAGGGAACATG

<210> SEQ ID NO 769 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 769 
>HUMCA1XIA_node_49

GGTCCCCAAGGGGAGCCTGGGCCTCCAGGTCAACAAGGGAATCCAGGACCTCAG 

<210> SEQ ID NO 770 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 770 
>HUMCA1XIA_node_51

GGTCTTCCTGGTCCACAAGGTCCAATTGGTCCTCCTGGTGAAAAA 

<210> SEQ ID NO 771 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 771 
>HUMCA1XIA_node_57

GGACCACAAGGAAAACCAGGACTTGCTGGACTTCCTGGTGCTGATGGGCCTCCT 
<210> SEQ ID NO 772

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 772 
>HUMCA1XIA_node_59

GGTCATCCTGGGAAAGAAGGCCAGTCTGGAGAAAAGGGGGCTCTG 

<210> SEQ ID NO 773 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 773 
>HUMCA1XIA_node_62

GGTCCCCCTGGTCCACAAGGTCCTATTGGATACCCGGGCCCCCGGGGAGTAAAG 

<210> SEQ ID NO 774 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 774 
>HUMCA1XIA_node_64

GGAGCAGATGGTGTCAGAGGTCTCAAGGGATCTAAAGGTGAAAAG 



<210> SEQ ID NO 775 
<211> Length : 54

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 775 
>HUMCA1XIA_node_66

GGTGAAGATGGTTTTCCAGGATTCAAAGGTGACATGGGTCTAAAAGGTGACAGA 

<210> SEQ ID NO 776 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 776
>HUMCA1XIA_node_68

GGAGAAGTTGGTCAAATTGGCCCAAGAGGGGAAGATGGCCCTGAAGGACCCAAAGGTCGAGCAGGCCCAACTGGAGA 
CCCAGGTCCTTCAGGTCAAGCAGGAGAAAAG 

<210> SEQ ID NO 777 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 777
>HUMCA1XIA_node_70

GGAAAACTTGGAGTTCCAGGATTACCAGGATATCCAGGAAGACAAGGTCCAAAG 



<210> SEQ ID NO 778 
<211> Length : 54

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 778
>HUMCA1XIA_node_72

GGTTCCACTGGATTCCCTGGGTTTCCAGGTGCCAATGGAGAGAAAGGTGCACGG 

<210> SEQ ID NO 779 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 779
>HUMCA1XIA_node_74

GGAGTAGCTGGCAAACCAGGCCCTCGGGGTCAGCGTGGTCCAACG 

<210> SEQ ID NO 780 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 780
>HUMCA1XIA_node_76

GGTCCTCGAGGTTCAAGAGGTGCAAGAGGTCCCACTGGGAAACCTGGGCCAAAG 

SEQ ID NO:781 
> AAQ89265 

MSLLPRRAPPVSMRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPHCEEKMVIITTKSVSR 
YRGQEHCLHPKLQSTKRFIKWYNAWNEKRRVYEE 
<210> SEQ ID NO 782

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 782
>HUMCA1XIA_node_78

GGCACTTCAGGTGGCGATGGCCCTCCTGGCCCTCCAGGTGAAAGA 

<210> SEQ ID NO 783 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 783
>HUMCA1XIA_node_81

GGTCCTCAAGGACCTCAGGGTCCAGTTGGATTCCCTGGACCAAAAGGCCCTCCT 

<210> SEQ ID NO 784 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 784
>HUMCA1XIA_node_83

GGACCACCTGGGAAGGATGGGCTGCCAGGACACCCTGGGCAACGTGGGGAGACT 
<210> SEQ ID NO 785

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 785
>HUMCA1XIA_node_85

GGATTTCAAGGCAAGACCGGCCCTCCTGGGCCAGGGGGAGTGGTTGGACCACAG 

<210> SEQ ID NO 786 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 786
>HUMCA1XIA_node_87

GGACCAACCGGTGAGACTGGTCCAATAGGGGAACGTGGGCATCCTGGCCCTCCTGGCCCTCCTGGTGAGCAAGGTCT 
TCCTGGTGCTGCAGGAAAAGAAGGTGCAAAG 

<210> SEQ ID NO 787 

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 787
>HUMCA1XIA_node_89

GGTGATCCAGGTCCTCAAGGTATCTCAGGGAAAGATGGACCAGCAGGATTACGTGGTTTCCCAGGGGAAAGAGGTCT 
TCCTGGAGCTCAG 
<210> SEQ ID NO 788

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 788
>HUMCA1XIA_node_91

GGTGCACCTGGACTGAAAGGAGGGGAAGGTCCCCAGGGCCCACCAGGTCCAGTT 

<210> SEQ ID NO 789 

<211> Length : 211 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 789
>T11628_PEA_1_node_7

GCAAGCCAAAACCCTGGGCAGACTCAATCCAAAAATAAACAATCAAAGAGCATGTTGGCCTGGTCCTTTGCTAGGTA 
CTGTAGAGCAGGTGAGAGAGTGAGGGGGAAGGACTCCAAATTAGACCAGTTCTTAGCCATGAAGCAGAGACTCTGAA 
GCCAGACTACCTGGGTCCCAATCTTGGGCTTGGTATTTCCTCGCTGTGTGACTCTGG 

<210> SEQ ID NO 790 

<211> Length : 131 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 790
>T11628_PEA_1_node_11

CCCAGATTTACCAAAGGGAATTGTCAGCTGTCCAAGGGCTAGCAAATTCCTAGGTCACCTAGATTGGATTTTCTGAC 
CATAAAAACTGTGGGCCAGGTGCACAGCTGCCTGAGGGGCTCAAACCTGTGCAG 
<210> SEQ ID NO 791

<211> Length : 214 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 791
>T11628_PEA_1_node_16

TCCTCCCCTTCTTTCCACACGCACAACCACCCCACCCCCTGTGGCCTGAGCTGTCCTGCCTCGCCACAATGGCACCT 
GCCCTAAAATAGCTTCCCATGTGAGGGCTAGAGAAAGGAAAAGATTAGACCCTCCCTGGATGAGAGAGAGAAAGTGA 
AGGAGGGCAGGGGAGGGGGACAGCGAGCCATTGAGCGATCTTTGTCAAGCATCCCAGAAG 

<210> SEQ ID NO 792 

<211> Length : 140 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 792
>T11628_PEA_1_node_22

GCCTTATTTCTCTGCTGGTTGAGCGAAGGGATTGTCTTCCATGGTCTCCGAGATCCCGTCCCACGCTCATGCCCTAG 
AATTCTCTGAGTCCTTGATGCACTTTTGCCTTTGGCGAGGAGGCAGGACAGTCAGGCGTGGAG 

<210> SEQ ID NO 793 

<211> Length : 143 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 793
>T11628_PEA_1_node_25

CTGAGGACTTAAAGAAGCATGGTGCCACCGTGCTCACCGCCCTGGGTGGCATCCTTAAGAAGAAGGGGCATCATGAG
GCAGAGATTAAGCCCCTGGCACAGTCGCATGCCACCAAGCACAAGATCCCCGTGAAGTACCTGGAG 

<210> SEQ ID NO 794 

<211> Length : 130 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 794
>T11628_PEA_1_node_31

CCCCACCCATCTGGGCCCCGGGTTCAAGAGAGAGCGGGGTCTGATCTCGTGTAGCCATATAGAGTTTGCTTCTGAGT 
GTCTGCTTTGTTTAGTAGAGGTGGGCAGGAGGAGCTGAGGGGCTGGGGCTGGG 

<210> SEQ ID NO 795 

<211> Length : 140 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 795
>T11628_PEA_1_node_37

CTTTGCAATGAGGAGGAGATCTGGGCTGGGCGGGCCAGCTGGGGAAGCATTTGACTATCTGGAACTTGTGTGTGCCT 
CCTCAGGTATGGCAGTGACTCACCTGGTTTTAATAAAACAACCTGCAACATCTCAAAAAAAGC 

<210> SEQ ID NO 796 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 796
>T11628_PEA_1_node_0

TGTCTCTGAGAGCATTGATGAGGTCAGGAAGCCTCCTGTTGGGTAGAGGAGCAACTAAGAGACTGAACTTGGCCCCC 
ACCCTGAGGCTCACAA 

<210> SEQ ID NO 797 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 797
>T11628_PEA_1_node_4

GCTTGAATTGCACCTGAGTTCCAAAGGAGAAGTTGACATTCTTCCAGAACATATGCCCAGTGTCTTCAACTTGAGAT 
GGAGCTGGGATGCCAAGTCTGCAAAT 

<210> SEQ ID NO 798 

<211> Length : 47 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 798
>T11628_PEA_1_node_9

CTTGGCTGGAGGCTCTGCGAGGACAGCTGGGGAGAAGGGGAGCTGTG 

<210> SEQ ID NO 799 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 799
>T11628_PEA_1_node_13
ATGGAGGCTCGCTCTGTT 

<210> SEQ ID NO 800 

<211> Length : 98 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 800
>T11628_PEA_1_node_14

GCCAGGCTGGAGTACAGCGATCTCGGCTCACTGCAACCTCTGCCTCCCGGGTTCAAGTGATTCTCCTGCCTCAGCCT 
CCCAAGTAGCTGGGACTACAG 

<210> SEQ ID NO 801 

<211> Length : 96 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 801
>T11628_PEA_1_node_17

GTATAAAAACGCCCTTGGGACCAGGCAGCCTCAAACCCCAGCTGTTGGGGCCAGGACACCCAGTGAGCCCATACTTG 
CTCTTTTTGTCTTCTTCAG 

<210> SEQ ID NO 802 

<211> Length : 78 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 802
>T11628_PEA_1_node_18

ACTGCGCCATGGGGCTCAGCGACGGGGAATGGCAGTTGGTGCTGAACGTCTGGGGGAAGGTGGAGGCTGACATCCCA 
G 

<210> SEQ ID NO 803 

<211> Length : 25 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 803

>T11628_PEA_1_node_19

GCCATGGGCAGGAAGTCCTCATCAG 

<210> SEQ ID NO 804 

<211> Length : 80 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 804
>T11628_PEA_1_node_24

GCTCTTTAAGGGTCACCCAGAGACTCTGGAGAAGTTTGACAAGTTCAAGCACCTGAAGTCAGAGGACGAGATGAAGG 
CGT 

<210> SEQ ID NO 805 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 805
>T11628_PEA_1_node_27

TTCATCTCGGAATGCATCATCCAGGTTCTGCAGAGCAAGCATCCCGGGGACTTTGGTGCTGATGCCCAGGGGGCCAT 
GAACAAG 

<210> SEQ ID NO 806 

<211> Length : 29 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 806

>T11628_PEA_1_node_28

GCCCTGGAGCTGTTCCGGAAGGACATGGC 

<210> SEQ ID NO 807 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 807

>T11628_PEA_1_node_29

CTCCAACTACAAGGAGCTGGGCTTCCAG 

<210> SEQ ID NO 808 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 808
>T11628_PEA_1_node_30
GGCTAGGCCCCTGCCGCTCCCAC 

<210> SEQ ID NO 809 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 809

>T11628_PEA_1_node_32

GTGTTGAAGTTGG 

<210> SEQ ID NO 810 

<211> Length : 22 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 810
>T11628_PEA_1_node_33
CTTTGCATGCCCAGCGATGCGC 

<210> SEQ ID NO 811 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 811
>T11628_PEA_1_node_34

CTCCCTGTGGGATGTCATCACCCTGGGAACCGGGAGTGGCCCTTG

<210> SEQ ID NO 812 

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 812
>T11628_PEA_1_node_35

GCTCACTGTGTTCTGCATGGTTTGGATCTGAATTAATTGTCCTTTCTTCTAAATCC 

<210> SEQ ID NO 813 

<211> Length : 118 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 813
>T11628_PEA_1_node_36

CAACCGAACTTCTTCCAACCTCCAAACTGGCTGTAACCCCAAATCCAAGCCATTAACTACACCTGACAGTAGCAATT 
GTCTGATTAATCACTGGCCCCTTGAAGACAGCAGAATGTCC 

<210> SEQ ID NO 814 

<211> Length : 178 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 814
>HUMCEA_PEA_1_node_0

CTCAGGGCAGAGGGAGGAAGGACAGCAGACCAGACAGTCACAGCAGCCTTGACAAAACGTTCCTGGAACTCAAGCTC

TTCTCCACAGAGGAGGACAGAGCAGACAGCAGAGACCATGGAGTCTCCCTCGGCCCCTCCCCACAGATGGTGCATCC 
CCTGGCAGAGGCTCCTGCTCACAG 



<210> SEQ ID NO 815 

<211> Length : 278 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 815
>HUMCEA_PEA_1_node_2

CCTCACTTCTAACCTTCTGGAACCCGCCCACCACTGCCAAGCTCACTATTGAATCCACGCCGTTCAATGTCGCAGAG 
GGGAAGGAGGTGCTTCTACTTGTCCACAATCTGCCCCAGCATCTTTTTGGCTACAGCTGGTACAAAGGTGAAAGAGT 
GGATGGCAACCGTCAAATTATAGGATATGTAATAGGAACTCAACAAGCTACCCCAGGGCCCGCATACAGTGGTCGAG 
AGATAATATACCCCAATGCATCCCTGCTGATCCAGAACATCATCCAG 

<210> SEQ ID NO 816 

<211> Length : 400 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 816
>HUMCEA_PEA_1_node_11

GTGAGTATATCTGCTCCTCTCTGGCCCAGGCTGCCAGCCCAAATCCACAGGGCCAGAGGCAGGATTTCTCAGTCCCT 

CTCAGGTTCAAGTACACAGACCCTCAACCCTGGACATCCAGACTGTCTGTGACTTTCTGCCCCAGAAAAACCTGGGC 

AGACCAAGTCTTGACCAAGAATAGGAGGGGAGGGGCTGCTTCTGTCCTGGGAGGCTCAGGGTCCACACCCTATGATG 

GGAGAAACAGGTGAATATCTCAGACTCAGGCTCAGTAGATACAAGAGGGGTTTGGCTGAGACTTTAGGATTGTGATT 

CAGCTTAGAGGGACACTGTGGTCCTTCCATAGACCAGGAACTTCCACTTCCCTCTGACAATATCACCTGTGGCTTTA 
TTTTGTTTGCTCCAG 
<210> SEQ ID NO 817

<211> Length : 255 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 817
>HUMCEA_PEA_1_node_12

ATGGCCCGGATGCCCCCACCATTTCCCCTCTAAACACATCTTACAGATCAGGGGAAAATCTGAACCTCTCCTGCCAC 

GCAGCCTCTAACCCACCTGCACAGTACTCTTGGTTTGTCAATGGGACTTTCCAGCAATCCACCCAAGAGCTCTTTAT 

CCCCAACATCACTGTGAATAATAGTGGATCCTATACGTGCCAAGCCCATAACTCAGACACTGGCCTCAATAGGACCA 
CAGTCACGACGATCACAGTCTATG 

<210> SEQ ID NO 818 

<211> Length : 190 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 818
>HUMCEA_PEA_1_node_31

CTCTCCTGCCATGCAGCCTCTAACCCACCTGCACAGTATTCTTGGCTGATTGATGGGAACATCCAGCAACACACACA 

AGAGCTCTTTATCTCCAACATCACTGAGAAGAACAGCGGACTCTATACCTGCCAGGCCAATAACTCAGCCAGTGGCC 
ACAGCAGGACTACAGTCAAGACAATCACAGTCTCTG 

<210> SEQ ID NO 819 

<211> Length : 127 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 819
>HUMCEA_PEA_1_node_36

CTGTCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAATGACGCAAGAGCCTATGTATGTGGAATCCA
GAACTCAGTGAGTGCAAACCGCAGTGACCCAGTCACCCTGGATGTCCTCT 

<210> SEQ ID NO 820 

<211> Length : 255 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 820
>HUMCEA_PEA_1_node_44

ATGGGCCGGACACCCCCATCATTTCCCCCCCAGACTCGTCTTACCTTTCGGGAGCGAACCTCAACCTCTCCTGCCAC 
TCGGCCTCTAACCCATCCCCGCAGTATTCTTGGCGTATCAATGGGATACCGCAGCAACACACACAAGTTCTCTTTAT 
CGCCAAAATCACGCCAAATAATAACGGGACCTATGCCTGTTTTGTCTCTAACTTGGCTACTGGCCGCAATAATTCCA 
TAGTCAAGAGCATCACAGTCTCTG 

<210> SEQ ID NO 821 

<211> Length : 1,174 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 821
>HUMCEA_PEA_1_node_46

CTGGGGTGGAGTCTATCTGGTTCTCACCAAAGAGCCAAGAAGACATTTTCTTTCCCAGTCTGTGTTCCATGGGCACA 
AGGAAATCCCAAATTCTATCCTGAGCCCCCTCACTCCATCTCGGCCAACTCTCTCCTCCCCGGCTTCTCTGATATCT 
CACGGCTGACCTCGGGTCCAGCCTGGAATGTGGGGAGGGGCCTCCCTTAGCCCCAGAAGGCCCCCAATAGTGAAAGG 
GACTTCATAGTCCAGAAGAAAGAAGGGTCCTTAAGGTCGAGTTGCTCCTCTCTATCACCAATATGTCCCTTTCTGTC 
ACCTCTTTGTGTTTTTTCACCTACTCTGTGAGCTACAAGGAACAAGGAGGCTTTGAAACCAGCCCACACTTTTTCCC 
CAAATGAGAGGAGGAAGCCCCTTGGATGAGGCAGGAGCAGCTCAGACTCTGCTCCCTGCTCTGCGCCCGGCTCACCC 
GGTGACTGGCTCTGCCCTGGCTCCACTTGGGGTGGGACCGGGGCATGTGGAGAAGGTGTCCAGGTGGCCTGTTTTGA 
ATCTGGGTAAATCAAGCTGCCAATCCACAGCAGAGCCTCCCTTGGGTCAGGTTGCAGGGAAATGGGAAAAGAGGGAG 
CCTCGGGACAGACTCCTGAGCTGTGTCCTGGCTCTGAAGTCACTGGCTGTATGAGGCTGTGGACACAGCACATAGGA 
CACAGCAGAGGAAAGTGAGTGACACACACTTGGAGAAATAGGGAGATTCAGCCATAGGGGCTCTGCATGGGAGGGAA 
CAGGCAGTGCCAAAAAGTGTGTGTTTATAGAGAGGGTAAGACTATCAGCCACTATATATATCTAACATAAAACTTAC

CATTAACCATTTCTAAGTGTACAATTAAGTGAAACAGCATAAATATCAATCAAGTATATTGCCCGGTGTGGTGGCTC 

ATCCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCGAGTGGATCACCTGAGGTCAGGAGTTCAAGATACAGAAAAAA 

AAAAATAGCTAGGCATGGTGGTGGGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCTCGAACC 

TGGGCGGTGTAGTTTGCAGTGAGCCGAGATTGAGCCACTGCACTCCAGCCTGGGTGACAGAGTGAGACTACATCACA 
AAAAAAAAAAAAAAAAAGG



<210> SEQ ID NO 822 

<211> Length : 179 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 822
>HUMCEA_PEA_1_node_63

CTCAAAAAGAAAAGAAAAGAAGACTCTGACCTGTACTCTTGAATACAAGTTTCTGATACCACTGCACTGTCTGAGAA 

TTTCCAAAACTTTAATGAACTAACTGACAGCTTCATGAAACTGTCCACCAAGATCAAGCAGAGAAAATAATTAATTT 
CATGGGACTAAATGAACTAATGAGG 

<210> SEQ ID NO 823 

<211> Length : 732 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 823
>HUMCEA_PEA_1_node_65

TTTGCTGATTCTTTAAATGTCTTGTTTCCCAGATTTCAGGAAACTTTTTTTCTTTTAAGCTATCCACAGCTTACAGC 
AATTTGATAAAATATACTTTTGTGAACAAAAATTGAGACATTTACATTTTCTCCCTATGTGGTCGCTCCAGACTTGG 
GAAACTATTCATGAATATTTATATTGTATGGTAATATAGTTATTGCACAAGTTCAATAAAAATCTGCTCTTTGTATA 
ACAGAATACATTTGAAAACATTGGTTATATTACCAAGACTTTGACTAGAATGTCGTATTTGAGGATATAAACCCATA 
GGTAATAAACCCACAGGTACTACAAACAAAGTCTGAAGTCAGCCTTGGTTTGGCTTCCTAGTGTCAATTAAACTTCT 
AAAAGTTTAATCTGAGATTCCTTATAAAAACTTCCAGCAAAGCAACTTTAAAAAAGTCTGTGTGGGCCGGGCGCGGT 
GGCTCACGCCTGTAATCCCAGCACTTTGATCCGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCCAGACCATCCTG 
GCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAGTTAGCCGGGCGTGGTGGTGGGGGCCTGTAGTCC

CAGCTACTCAGGAGGCTGAGGCAGGAGAACGGCATGAACCCGGGAGGCAGGGCTTGCAGTGAGCCAAGATCATGCCG 
CTGCACTCCAGCCTGGGAGACAAAGTGAGACTCCGTCAA 

<210> SEQ ID NO 824 

<211> Length : 280 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 824
>HUMCEA_PEA_1_node_67

CGGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAACCCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGT 
GAACCTGAGGTTCAGAACACAACCTACCTGTGGTGGGTAAATGGTCAGAGCCTCCCGGTCAGTCCCAGGCTGCAGCT 
GTCCAATGGCAACATGACCCTCACTCTACTCAGCTGTCAAAAGGAACGATGCAGGATCCTATGAATGTGAAATACAG 
AACCCAGCGAGTGCCAACCGCAGTGACCCAGTCACCCTGAATGTCCTCT 

<210> SEQ ID NO 825 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 825
>HUMCEA_PEA_1_node_3

AATGACACAGGATTCTACACCCTACACGTCATAAAGTCAGATCTTGTGAATGAAGAAGCAACTGGCCAGTTCCGGGT 
ATACC 

<210> SEQ ID NO 826 

<211> Length : 104 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 826
>HUMCEA_PEA_1_node_7

CGGAGCTGCCCAAGCCCTCCATCTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGT 
GAACCTGAGACTCAGGACGCAACCTAC 

<210> SEQ ID NO 827 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 827
>HUMCEA_PEA_1_node_8

CTGTGGTGGGTAAACAATCAGAGCCTCCCGGTCAGTCCCAGGCTGCAG 

<210> SEQ ID NO 828 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 828
>HUMCEA_PEA_1_node_9

CTGTCCAATGGCAACAGGACCCTCACTCTATTCAATGTCACAAGAAAT 

<210> SEQ ID NO 829 

<211> Length : 79 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 829
>HUMCEA_PEA_1_node_10

GACACAGCAAGCTACAAATGTGAAACCCAGAACCCAGTGAGTGCCAGGCGCAGTGATTCAGTCATCCTGAATGTCCT 
CT 

<210> SEQ ID NO 830 

<211> Length : 3 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 830

>HUMCEA_PEA_1_node_15

CAG 

<210> SEQ ID NO 831 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 831
>HUMCEA_PEA_1_node_16
AGCCAC 

<210> SEQ ID NO 832 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 832
>HUMCEA_PEA_1_node_17
CCAAACC 

<210> SEQ ID NO 833 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 833
>HUMCEA_PEA_1_node_18
CTTCATCACCAGCAACAA 

<210> SEQ ID NO 834 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 834
>HUMCEA_PEA_1_node_19

CTCCAACCCCGTGGAGGATGAGGATGCTGTAGCCTTAACCTGTGAACCTGAGATTCAGAACACAACCTAC 

<210> SEQ ID NO 835 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 835
>HUMCEA_PEA_1_node_20

CTGTGGTGGGTAAATAATCAGAGC

<210> SEQ ID NO 836 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 836

>HUMCEA_PEA_1_node_21

CTCCCGGTCAGTCCCAGGCTGCAG 

<210> SEQ ID NO 837 

<211> Length : 78 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 837
>HUMCEA_PEA_1_node_22

CTGTCCAATGACAACAGGACCCTCACTCTACTCAGTGTCACAAGGAATGATGTAGGACCCTATGAGTGTGGAATCCA 
G 

<210> SEQ ID NO 838 

<211> Length : 30 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 838
>HUMCEA_PEA_1_node_23

AACGAATTAAGTGTTGACCACAGCGACCCA

<210> SEQ ID NO 839 

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 839

>HUMCEA_PEA_1_node_24

GTCATCCTGAATGTCCTCT 

<210> SEQ ID NO 840 

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 840

>HUMCEA_PEA_1_node_27

ATGGCCCAGACGACCCCAC 

<210> SEQ ID NO 841 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 841
>HUMCEA_PEA_1_node_29
CATTTCCCCCTCATACAC 
<210> SEQ ID NO 842

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 842
>HUMCEA_PEA_1_node_30
CTATTACCGTCCAGGGGTGAACCTCAGC 

<210> SEQ ID NO 843 

<211> Length : 22 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 843
>HUMCEA_PEA_1_node_33
CGGAGCTGCCCAAGCCCTCCAT 

<210> SEQ ID NO 844 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 844
>HUMCEA_PEA_1_node_34

CTCCAGCAACAACTCCAAACCCGTGGAGGACAAGGATGCTGTGGCCTTCACCTGTGAACCTGAGGCTCAGAACACAA 
CCTAC 

<210> SEQ ID NO 845 
<211> Length : 48 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 845
>HUMCEA_PEA_1_node_35

CTGTGGTGGGTAAATGGTCAGAGCCTCCCAGTCAGTCCCAGGCTGCAG 

<210> SEQ ID NO 846 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 846
>HUMCEA_PEA_1_node_45
GTAAGTGGCTCCCTGGAGCATCAGCATCATATT 

<210> SEQ ID NO 847 

<211> Length : 27 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 847

>HUMCEA_PEA_1_node_50

CATCTGGAACTTCTCCTGGTCTCTCAG 

<210> SEQ ID NO 848 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 848
>HUMCEA_PEA_1_node_51

CTGGGGCCACTGTCGGCATCATGATTGGAGTGCTGGTTGGGGTTGCTCTGATATAGCAGCCCTGGTGTAGTTTCTTC 
ATTTCAGGAAGACTG 

<210> SEQ ID NO 849 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 849
>HUMCEA_PEA_1_node_56
ACAGTTGTTTTGCTTCTTCCTTAAAG 

<210> SEQ ID NO 850 

<211> Length : 101 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 850
>HUMCEA_PEA_1_node_57

CATTTGCAACAGCTACAGTCTAAAATTGCTTCTTTACCAAGGATATTTACAGAAAAGACTCTGACCAGAGATCGAGA 
CCATCCTAGCCAACATCGTGAAAC 

<210> SEQ ID NO 851 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 851
>HUMCEA_PEA_1_node_58

CCCATCTCTACTAAAAATACAAAAATGAGCTGGG

<210> SEQ ID NO 852 

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 852
>HUMCEA_PEA_1_node_60

CTTGGTGGCGCGCACCTGTAGTCCCAGTTACTCGGGAGGCTGAG 

<210> SEQ ID NO 853 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 853

>HUMCEA_PEA_1_node_61

GCAG 

<210> SEQ ID NO 854 

<211> Length : 88

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 854
>HUMCEA_PEA_1_node_62

GAGAATCGCTTGAACCCGGGAGGTGGAGATTGCAGTGAGCCCAGATCGCACCACTGCACTCCAGTCTGGCAACAGAG 
CAAGACTCCAT 



<210> SEQ ID NO 855 
<211> Length : 30

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 855

>HUMCEA_PEA_1_node_64

ATAATATTTTCATAATTTTTTATTTGAAAT 



Segment nucleic acid sequences : 

<210> SEQ ID NO 856 

<211> Length : 266 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 856

>R35137_PEA_1_PEA_1_PEA_1_node_2

TGCCACCTCACCCACTGCCTCTGCCTCCCTGGGGCAGAGCTGTTCCCAGACGGGTGGGGCGGGGCCCAACTGTCCCC 
AGCTCCTTCAGCCCTTTCTGTCCCTCCCAGTGAGGCCAGCTGCGGTGAAGAGGGTGCTCTCTTGCCTGGAGTTCCCT 
CTGCTACGGCTGCCCCCTCCCAGCCCTGGCCCACTAAGCCAGACCCAGCTGTCGCCATTCCCACTTCTGGTCCTGCC 
ACCTCCTGAGCTGCCTTCCCGCCTGGTCTGGGTAG 

<210> SEQ ID NO 857 

<211> Length : 166 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 857

>R35137_PEA_1_PEA_1_PEA_1_node_3

AGTCATGGCCTCGAGCACAGGTGACCGGAGCCAGGCGGTGAGGCATGGACTGAGGGCGAAGGTGCTGACGCTGGACG
GCATGAACCCGCGTGTGCGGAGAGTGGAGTACGCAGTGCGTGGCCCCATAGTGCAGCGAGCCTTGGAGCTGGAGCAG 
GAGCTGCGCCAG 

<210> SEQ ID NO 858 

<211> Length : 134 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 858
>R35137_PEA_1_PEA_1_PEA_1_node_9

GGGCCTACAGCGTCAGCTCCGGCATCCAGCTGATCCGGGAGGACGTGGCGCGGTACATTGAGAGGCGTGACGGAGGC 
ATCCCTGCGGACCCCAACAACGTCTTCCTGTCCACAGGGGCCAGCGATGCCATCGTG 

<210> SEQ ID NO 859 

<211> Length : 190 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 859
>R35137_PEA_1_PEA_1_PEA_1_node_11

ACGGTGCTGAAGCTGCTGGTGGCCGGCGAGGGCCACACACGCACGGGTGTGCTCATCCCCATCCCCCAGTACCCACT 
CTACTCGGCCACGCTGGCAGAGCTGGGCGCAGTGCAGGTGGATTACTACCTGGACGAGGAGCGTGCCTGGGCGCTGG 
ACGTGGCCGAGCTTCACCGTGCACTGGGCCAGGCGC 

<210> SEQ ID NO 860 

<211> Length : 137 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 860
>R35137_PEA_1_PEA_1_PEA_1_node_16

GTGTACCAGGACAACGTGTACGCCGCGGGTTCGCAGTTCCACTCATTCAAGAAGGTGCTCATGGAGATGGGGCCGCC 
CTACGCCGGGCAGCAGGAGCTTGCCTCCTTCCACTCCACCTCCAAGGGCTACATGGGCGA 

<210> SEQ ID NO 861 

<211> Length : 175 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 861
>R35137_PEA_1_PEA_1_PEA_1_node_18

GTGCGGGTTCCGCGGCGGCTATGTGGAGGTGGTGAACATGGACGCTGCAGTGCAGCAGCAGATGCTGAAGCTGATGA 
GTGTGCGGCTGTGCCCGCCGGTGCCAGGACAGGCCCTGCTGGACCTGGTGGTCAGCCCGCCCGCGCCCACCGACCCC 
TCCTTTGCGCAGTTCCAGGCT 

<210> SEQ ID NO 862 

<211> Length : 156 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 862

>R35137_PEA_1_PEA_1_PEA_1_node_20

GAGAAGCAGGCAGTGCTGGCAGAGCTGGCGGCCAAGGCCAAGCTCACCGAGCAGGTCTTCAATGAGGCTCCTGGCAT 
CAGCTGCAACCCAGTGCAGGGCGCCATGTACTCCTTCCCGCGCGTGCAGCTGCCCCCGCGGGCGGTGGAGCGCGCTC 

AG 

<210> SEQ ID NO 863 

<211> Length : 2,023 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 863

>R35137_PEA_1_PEA_1_PEA_1_node_27

GAGCCCTGGGAGGCTCTGGAGCCCACTGTACTTGCTCTTGATGCCTGGCGGGGTGGGGTGGGGGGGGTGCTGGGCCC 

CTGCCTCTCTGCAGGTCCCTAATAAAGCTGTGTGGCAGTCTGACTCCAAAAAGGAAGCGTTGGCAGCTGCGTGGCCC 

GCTCCCACCTGCCTACCCTTCTTGCAGGCCTGAGTCCCTTCAGAGAAGGGACCTTCCACGGCCACCACCCACCTCTT 

CCTCCTGAAGACCCCGTGCCCACCATAGGCTGGGTCTTCCCTCTGGCCTCTGGTTGTGGGGCAGAGCCCGTCAGATC 

ACACAGAAATGGGTTGAGAGGGTCCAGAGTGTGAGGAAAGCGCAGGCCCCACACCCCTTTGTGGAAGCCCCCAAGAA 

TCTAGGGAGCCAGGGGCCCAGGTGGCCACCCGAAGAAACACAGCCTTTCCTGAGGAAGGCACAGTGAACTGCCTCCT 

TCCTGGCTCCCTTTCCTGTGAGGTCCATGTCTTCCCTGGGGCAGGGGGAAATACTAAACCAGCATGGTGCTGGCTGG 

TCAGGGTGACTGACAGCTCAGGAAGGAAGTCTTGGTTCTCTTACCCAAGGAAGCAGGGGTGGGGCCACTGTCTGGGG 

GGCCAGAGACCACCTTTGGTGTCATTGTGTGGTGCAGTCCTTCGGCTGGGTGGAGTGGGAGGCAGAGGGAGAGGATA 

AGGGAGCGTCCCAGGGGAGGGCTGGGGCTGGAGGAGGCAGGGGCTGGGCTGAGCGGGAGTGGGCAGCGCTGTGTCCT 

GGCCTGGGGAGCATGGCTGAGCACCTACTACATGCAGACACTGCTGGGGGGTCACTCAATCTGCACAGATGCTCTTC 

TGAAGTAGGCATGATAGTCCCCATTTGATAGACGTGGAAACCTGCAACCCAAAAACCTGCTGAGCTGACAAGAAACC 

CCCTCAGAGGCCTCGGCAGCCAGAAAAATGCGTTGGGTCCAGTGCCCTCAAGTCCGCCAAGGACATGGGCTGGCTTT 

AGAGACTCACAAACTTGGGAGATAGGACTGGCCAAGGGCACCTGGTTTTTTTCGCTCTGGAGATGGTTCTTAACCAC 

AGGCCACACACACTTCACAGCCTCATCTGGCCCTCGGGAGCCCCAGAGGGCACAGCTCTGGGCAGGAGACACAGCAG 

GTGGGCCCCTCCCTTGGCAGGGCGGGCTCGAATCAGGCAGGGTGCTCCTAGCCTTGTCACCGGACACCGAAGGGCGT 

CACGGGCAGTGGCTGGCGGTGTCCTTCTGAGCTAAGCTGGGTCCTGACCTTTCACACTTCCCTCCTAACTTCCATGG 

CTCTGTCACCGCCTTACGGAGGAGCTGAAGCCACAGACACAGCAAGGTTGGGGTCCGCACCGGAAGTATCCAGTGGT 

AGACGGCGGAACCCTTAAGAAACGGACGCCTTCATGCGGGCGGCTGGAGAAGCGGGGGCTGGGCACTGCAGCAACCA 

CGCTTCGGCTGACACCAGGAAGGAAGCACGCCTGGAGCGGATCCGAAACTACTGAGAGGGGCCAGGGCTGGCCGTGG 

GCGCAGGCCGATCCTTACATTCGAGGCCGGCCCAGCTCTGTAGCTTCCCCCTCTGGGCCTCTCACCCGCGCAGGACC 

TCGGTGGGAGCGCGCACGTGGCGGGGCGGGGGGCCGCGGGCCCGAGCCCGGACTGGCCACCGGGGGCGCCCGCGAGC 

TGACGCTCTGGCCCGTTGCGGTCTCTGTGGCCCGCGCGACCTTCCGGCCCTGGAGCCGGTGGCCGCGGGGCTCCAGC 

GACGCCGTGTGGTCCGTGCTCCGCTCTGTGGCTCCAGGGGGCCGAGAAACTGCTGAGAGTCCGGCCCGGCCGGCAGT 

GCTGGGCGCGGGCCGAGGGCCCCGGGAGGCAGCGGCCCCGCCCTCTTTACCTGCGGCCTCGCAGAGCATGCTGGGAG 

CCGCGGGAGGCAGTGGCCCCGCTCCCCTCACCTGCGGTATCGCAGAGCATGGTGGGAGCCCCGGGAGGCAGTGGCCC 

CGCCCCCTTCCCTGCGGCCGC 

<210> SEQ ID NO 864 

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 864
>R35137_PEA_1_PEA_1_PEA_1_node_5

GGTGTGAAGAAGCCTTTCACCGAGGTCATCCGTGCCAACATCGGGGACGCACAGGCTATGGGGCAGAGGCCCATCAC 
CTTCCTGCGCCAG 

<210> SEQ ID NO 865 

<211> Length : 109 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 865

>R35137_PEA_1_PEA_1_PEA_1_node_7

GTCTTGGCCCTCTGTGTTAACCCTGATCTTCTGAGCAGCCCCAACTTCCCTGACGATGCCAAGAAAAGGGCGGAGCG 
CATCTTGCAGGCGTGTGGGGGCCACAGTCTGG 

<210> SEQ ID NO 866 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 866

>R35137_PEA_1_PEA_1_PEA_1_node_12

GTGACCACTGCCGCCCTCGTGCGCTCTGTGTCATCAACCCTGGCAACCCCACCG 

<210> SEQ ID NO 867 

<211> Length : 80 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 867

>R35137_PEA_1_PEA_1_PEA_1_node_14

GGCAGGTGCAGACCCGCGAGTGCATCGAGGCCGTGATCCGCTTCGCCTTCGAAGAGCGGCTCTTTCTGCTGGCGGAC
GAG 

<210> SEQ ID NO 868 

<211> Length : 67 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 868

>R35137_PEA_1_PEA_1_PEA_1_node_15

GTGCGCGGCGCGGGGGAGCGGGAAGCCGGGCAACAGTCCGCCCCCGTGACGCCTTGCGCCCTTCCAG 

<210> SEQ ID NO 869 

<211> Length : 100 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 869

>R35137_PEA_1_PEA_1_PEA_1_node_17

GTGCGTGCGTACGAGGCGGGTGGGGGCTCGCGGGCCATGGCCAGGCCCTCCTCGCCCGATGGGCCACCCCCTCCTCC 
GCACCTGACCTGGCCGTGCGCAG 

<210> SEQ ID NO 870 

<211> Length : 74 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 870

>R35137_PEA_1_PEA_1_PEA_1_node_21

GTCAGGCGGGGGCGGGGCCTGCGGGGTGGGCAGGGGGGGCCGGGCATCCCTCTCTGACGGCTCTCCGTCCACAG 
<210> SEQ ID NO 871

<211> Length : 73 

<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 871

>R35137_PEA_1_PEA_1_PEA_1_node_22

GAGCTGGGCCTGGCCCCCGATATGTTCTTCTGCCTGCGCCTCCTGGAGGAGACCGGCATCTGCGTGGTGCCAG 

<210> SEQ ID NO 872 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 872

>R35137_PEA_1_PEA_1_PEA_1_node_23
GGAGCGGCTTTGGGCAGCGGGAAGGCACCTACCACTTCCG 

<210> SEQ ID NO 873 

<211> Length : 66 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 873

>R35137_PEA_1_PEA_1_PEA_1_node_24

GTGAGGCCTGGCCCTCACTCCCTGTCCCGCCACCCTGGCCCTTCACTCACTGTCAACTCCTTTCAG 



<210> SEQ ID NO 874 
<211> Length : 81 
<212> Type : DNA 
<213> Organism : Homo sapiens
<400> sequence : 874

>R35137_PEA_1_PEA_1_PEA_1_node_25

GATGACCATTCTGCCCCCCTTGGAGAAACTGCGGCTGCTGCTGGAGAAGCTGAGCAGGTTCCATGCCAAGTTCACCC 
TCGA 

<210> SEQ ID NO 875 

<211> Length : 57 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 875
>R35137_PEA_1_PEA_1_PEA_1_node_26

GTACTCCTGAGCACCCCAGCTGGGGCCAGGCTGGGTCGCCCTGGACTGTGTGCTCAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 876 

<211> Length : 582 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 876
>Z25299_PEA_2_node_20

GTAAGCAGGGGATGAGGGCACACTGAGCTCCCTCCAGCCCTCTCAGCCTCAACCCTCTGGAGGCCCAGGCATATGGG 
CAGGGGGACTCCTGAACCCTACTCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCTTCAAGAGAACTGTTCTCCAGG 
TCTCAGGGCCAGGATTTCCATAGGATCGCCTGTGGCTTTGATTCTATTCTAGTGTCTCTGGGTGGGGGTCCTGGGCA 
AGTGTCTTTCTGAGTCTCAGTTTCTTTATCGGTAAAATGTACATAATGAGATTGAAAGTGCTCTGCAAAGCACTATG 
TGCACTAAGAATTTATTATTCAGGTTGTTTCCATCATGTTTTCTGAGGTGAAATCACAAAGGATCAGTGGAGTTTGA 
GGATTATCTAGTTCAATGCTTTGAGTTTAGAGTTTTACGGTGAAAATGAGACTTGTCTCCTGACACTAAGTCTCTCT 
CAACTATAGCGCTATCTTGCTATTTTCTCTATCTCAGAAGGATCCTTGGGCAGGAGGAAGGATGTGGATATGATTTG
GCTGGTTTCTATGCTGAAGCTCTTATCTGATTTTCTCTCACAG 

<210> SEQ ID NO 877 

<211> Length : 193 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 877
>Z25299_PEA_2_node_21

CTTGATTCCTGCCATATGGAGGAGGCTCTGGAGTCCTGCTCTGTGTGGTCCAGGTCCTTTCCACCCTGAGACTTGGC 
TCCACCACTGATATCCTCCTTTGGGGAAAGGCTTGGCACACAGCAGGCTTTCAAGAAGTGCCAGTTGATCAATGAAT 
AAATAAACGAGCCTATTTCTCTTTGCAAAACCTGCTTCT 

<210> SEQ ID NO 878 

<211> Length : 190 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 878
>Z25299_PEA_2_node_23

GTGAAAAGAGACATCACAAGCAATTGAGGGACCAGGAAGTGGATCCTCTAGAGATGAGGAGGCATTCTGCTGGATGA 
CTTTTAAAAATGTTTTCTCCAGAGTCATCTCTCTCATTAACAATGTTTTTTGTCTTAGAAATTTCTTGTTGATTTTT 
AAACTTACATGATTTCTTGTTTTGGTATGAATACAG 

<210> SEQ ID NO 879 

<211> Length : 179 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 879 




>Z25299_PEA_2_node_24

GCTGCTTCAGTCCTTCAATAAGCCCATCACACTTTTTCACCATGTCATCTATCAGCACTTTTTCTGCAGTGTTACGA 
ACATCAGCTTCATCACTGTCAGCCTGCGTTTTGCCTGCAACCCATCAAATGAGGTCAGGAGAGGAGTTTTCCACTTT 

TGGCTTCATGTTGGTGCTCAAAACT 

<210> SEQ ID NO 880 

<211> Length : 208 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 880
>Z25299_PEA_2_node_8

GTTTCCTGCTTATGCAATAGTAGCTGGGAGAGGCCGAAAGAATTCTGGTGGGGCCACACCCACTGGTGAAAGAATAA 
ATAGTGAGGTTTGGCATTGGCCATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGG 
TGCTGCTTGCCCTGGGAACTCTGGCACCTTGGGCTGTGGAAGGCTCTGGAAAGT 

<210> SEQ ID NO 881 

<211> Length : 37 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 881
>Z25299_PEA_2_node_12

CCTTCAAAGCTGGAGTCTGTCCTCCTAAGAAATCTGC 

<210> SEQ ID NO 882 

<211> Length : 112 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 882
>Z25299_PEA_2_node_13

CCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACTGGCAGTGTCCAGGGAAGAAGAGATGTTGTCCTGACA 
CTTGTGGCATCAAATGCCTGGATCCTGTTGACACC 

<210> SEQ ID NO 883 

<211> Length : 10 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 883 
>Z25299_PEA_2_node_14
CCAAACCCAA 

<210> SEQ ID NO 884 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 884 
>Z25299_PEA_2_node_17
CAAG 

<210> SEQ ID NO 885 

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 885 
>Z25299_PEA_2_node_18

GAGGAAGCCTGGGAAGTGCCCAGTGACTTATGGCCAATGTTTGATGCTTAACCCCC

<210> SEQ ID NO 886

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 886 
>Z25299_PEA_2_node_19

CCAATTTCTGXGAGATGGATGGCCAGTGCAAGCGTGACTTGAAGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTT 
TCCCCTGTGAAAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 887 

<211> Length : 131 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 887 
>HSSTROL3_node_6 

CAAGCCCAGCAGCCCCGGGGCGGATGGCTCCGGCCGCCTGGCTCCGCAGCGCGGCCGCGCGCGCCCTCCTGCCCCCG 
ATGCTGCTGCTGCTGCTCCAGCCGCCGCCGCTGCTGGCCCGGGCTCTGCCGCCG 

<210> SEQ ID NO 888 

<211> Length : 182 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 888 
>HSSTROL3_node_10

GACGTCCACCACCTCCATGCCGAGAGGAGGGGGCCACAGCCCTGGCATGCAGCCCTGCCCAGTAGCCCGGCACCTGC
CCCTGCCACGCAGGAAGCCCCCCGGCCTGCCAGCAGCCTCAGGCCTCCCCGCTGTGGCGTGCCCGACCCATCTGATG
GGCTGAGTGCCCGCAACCGACAGAAGAG

<210> SEQ ID NO 889 

<211> Length : 144 

<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 889 
>HSSTROL3_node_13 

GATCCTTCGGTTCCCATGGCAGTTGGTGCAGGAGCAGGTGCGGCAGACGATGGCAGAGGCCCTAAAGGTATGGAGCG 
ATGTGACGCCACTCACCTTTACTGAGGTGCACGAGGGCCGTGCTGACATCATGATCGACTTCGCCAG 

<210> SEQ ID NO 890 

<211> Length : 134 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 890 
>HSSTROL3_node_15 

GTACTGGCATGGGGACGACCTGCCGTTTGATGGGCCTGGGGGCATCCTGGCCCATGCCTTCTTCCCCAAGACTCACC 
GAGAAGGGGATGTCCACTTCGACTATGATGAGACCTGGACTATCGGGGATGACCAGG 

<210> SEQ ID NO 891 

<211> Length : 183 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 891 
>HSSTROL3_node_19

ACAACAGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGACTGCAG
GGGCGTTCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCCCCAGGCTGGGA 
TAGACACCAATGAGATTGCACCGCTGGAG 

<210> SEQ ID NO 892 

<211> Length : 217 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 892 
>HSSTROL3_node_21 

CCAGACGCCCCGCCAGATGCCTGTGAGGCCTCCTTTGACGCGGTCTCCACCATCCGAGGCGAGCTCTTTTTCTTCAA 
AGCGGGCTTTGTGTGGCGCCTCCGTGGGGGCCAGCTGCAGCCCGGCTACCCAGCATTGGCCTCTCGCCACTGGCAGG 
GACTGCCCAGCCCTGTGGACGCTGCCTTCGAGGATGCCCAGGGCCACATTTGGTTCTTCCAAG 

<210> SEQ ID NO 893 

<211> Length : 138 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 893 
>HSSTROL3_node_24

AGCTGGGATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGACTTCACAAATGAAGG 
CACAGCATGGGAAACCTGCGTGGGTTCCAGGGCAGTCCAGCCTGCAGGGGCCCAGGGAGTG 

<210> SEQ ID NO 894 

<211> Length : 300 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 894 
>HSSTROL3_node_25

GTCAGTAGGCATTTGTCACAGCCAAATGCCAGTGGAAGGAGCAGCCGCCCAGGCAGCCCTCTACTGATGAGAGTAAC 
CTCACCCGTGCACTAGTTTACAGAGCATTCACTGCCCCAGCTTATCCCAGGCCTCCCGCTTCCCTCTGCGGGTGGGG 
TGCTGAGCAGGCATTATTGGCCTGCATGTTTTACTGATGAGGAAACTGAGGCTGGGAGAGTCTGTGGTAGGGGTCAA 
GCAGGTCCACAGTGGCGGGGCATGGCAGTGGTGGCTGGGCAGGTCCTTGCAGCCTTCCCTCTCCGGCAG 

<210> SEQ ID NO 895 

<211> Length : 142 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 895 
>HSSTROL3_node_26

GTGCTCAGTACTGGGTGTACGACGGTGAAAAGCCAGTCCTGGGCCCCGCACCCCTCACCGAGCTGGGCCTGGTGAGG 
TTCCCGGTCCATGCTGCCTTGGTCTGGGGTCCCGAGAAGAACAAGATCTACTTCTTCCGAGGCAG 

<210> SEQ ID NO 896 

<211> Length : 927 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 896 
>HSSTROL3_node_28

GTGCGTTGGGGGTGAGGCAGCTGGTGGGAGGTGGGCACAGCAGCCGCTTCTCCCACCTGGTGGTGGCTGGGCTCCCA 
CATGCCTGCCACAGGAAGTCTGGCTCTTCATCACAGGTCCTTTGTCCAGAGCCATCTGCCCTCCTCTCGGTGGCCGG 
CTAGTGCTACATTCCATATTGCAGATGAGGAAACTGAGGGTCAGAGAAGTGCAAGGTCTTACCCTGGTTTTTCAGCC 
ACAGCCAGTAGAACAATAAACTGCTGTACACTGAGGGCCAACAATGCTCTAAGCTCCTTACTGGTCTCATCCAGTTC 
TCAGAACAGCCCTCTGATGTGACACCTGTTGTGAACCCAGTTTCCAGAGGAGCAAACAGAGGCTCAGGCAATGAGGC 
CCCTAACCTGGACTACCCTGGTGGTCCCTGCTCCTAACCACTGACCCACCCAGCCTCCCACAACCACAGGGGGCTAG 
AGCCAGTCCAGTGCTCCCTCCCCTGCTAGGCTCCTCTTCTGTGCTCTTTCTCCCACATCAGGACCCACTGGGAGAGC 
TATCCTAGGGTAGCCTCCAGCTCCAGGACTCCAGGGTGCCCGTCAATAGCCTGGCTAATTTAATAGATGCAGGAGAG 
AGTGATGTGGAGGGTGGTGGGGGCAACGGGACTTGCTTTCCTGAGAGGTGGGACTCAGGCCTCTGAGGCTCTGGGTA 
CCTGTCAGGCTGGGTATTAGCCCAGCCCAGATTCCGGGGCAGGCAGAAGGGCTCCCTAGAGGGAAGAGAGGTTCTGA 
AAGGCCGGCCCTGGATCCTGCAGGACTCGAGGAACTCAGCAGTGGCCAAGGGCTTCCCACTCAGCCCTCCCTTAGTG 
CCCATCCCTGGGCACAGCCTGACAGGCAGGAGTAGGGCCCAGTGTCCACTCGCCCAGGCTTGACCACCTTCTCTTCT
CAG 

<210> SEQ ID NO 897 

<211> Length : 911 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 897 
>HSSTROL3_node_29

GCTATGCCTACTTCCTGCGCGGCCGCCTCTACTGGAAGTTTGACCCTGTGAAGGTGAAGGCTCTGGAAGGCTTCCCC 
CGTCTCGTGGGTCCTGACTTCTTTGGCTGTGCCGAGCCTGCCAACACTTTCCTCTGACCATGGCTTGGATGCCCTCA 
GGGGTGCTGACCCCTGCCAGGCCACGAATATCAGGCTAGAGACCCATGGCCATCTTTGTGGCTGTGGGCACCAGGCA 
TGGGACTGAGCCCATGTCTCCTCAGGGGGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGAGGGCCACGCAG 
GTCGTGGTCACCTGCCAGCGACTGTCTCAGACTGGGCAGGGAGGCTTTGGCATGACTTAAGAGGAAGGGCAGTCTTG 
GGCCCGCTATGCAGGTCCTGGCAAACCTGGCTGCCCTGTCTCCATCCCTGTCCCTCAGGGTAGCACCATGGCAGGAC 
TGGGGGAACTGGAGTGTCCTTGCTGTATCCCTGTTGTGAGGTTCCTTCCAGGGGCTGGCACTGAAGCAAGGGTGCTG 
GGGCCCCATGGCCTTCAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCACTTCCTGAGGTCAGGTCTTGGTAG 
GTGCCTGCATCTGTCTGCCTTCTGGCTGACAATCCTGGAAATCTGTTCTCCAGAATCCAGGCCAAAAAGTTCACAGT 
CAAATGGGGAGGGGTATTCTTCATGCAGGAGACCCCAGGCCCTGGAGGCTGCAACATACCTCAATCCTGTCCCAGGC 
CGGATCCTCCTGAAGCCCTTTTCGCAGCACTGCTATCCTCCAAAGCCATTGTAAATGTGTGTACAGTGTGTATAAAC 
CTTCTTCTTCTTTTTTTTTTTTAAACTGAGGATTGTCATTAAACACAGTTGTTTTCTACCTGCC 

<210> SEQ ID NO 898 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 898 
>HSSTROL3_node_11

GTTCGTGCTTTCTGGCGGGCGCTGGGAGAAGACGGACCTCACCTACAG

<210> SEQ ID NO 899

<211> Length : 41

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 899 
>HSSTROL3_node_17 

GCACAGACCTGCTGCAGGTGGCAGCCCATGAATTTGGCCAC 

<210> SEQ ID NO 900 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 900 
>HSSTROL3_node_18
GTGCTGGGGCTGCAGCAC 

<210> SEQ ID NO 901 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 901 
>HSSTROL3_node_20

GTGAGGCCCTGCCTGCCAGTCCCCCTACTCCTCTGCTGGCCACTGTGACTGCAGCATATGCCCTCAGCATGTGTCCC 
TCTCTCCCACCCCAG 

<210> SEQ ID NO 902 

<211> Length : 116 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 902
>HSSTROL3_node_27 

GGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAGGGCCACTGACTGGAGAGGGG 
TGCCCTCTGAGATCGACGCTGCCTTCCAGGATGCTGATG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 903 

<211> Length : 359 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 903 
>HUMTREFAC_PEA_2_node_0 

CGCTCCCCAGTAGAGGACCCGGAACCAGAACTGGAATCCGCCCTTACCGCTTGCTGCCAAAACAGTGGGGGCTGAAC 
TGACCTCTCCCCTTTGGGAGAGAAAAACTGTCTGGGAGCTTGACAAAGGCATGCAGGAGAGAACAGGAGCAGCCACA 
GCCAGGAGGGAGAGCCTTCCCCAAGCAAACAATCCAGAGCAGCTGTGCAAACAACGGTGCATAAATGAGGCCTCCTG 
GACCATGAAGCGAGTCCTGAGCTGCGTCCCGGAGCCCACGGTGGTCATGGCTGCCAGAGCGCTCTGCATGCTGGGGC 
TGGTCCTGGCCTTGCTGTCCTCCAGCTCTGCTGAGGAGTACGTGGGCCTGT 

<210> SEQ ID NO 904 

<211> Length : 586 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 904 
>HUMTREFAC_PEA_2_node_9

CTCCAGCTGCCCCCGGCCGGGGGATGCGAGGCTCGGAGCACCCTTGCCCGGCTGTGATTGCTGCCAGGCACTGTTCA
TCTCAGCTTTTCTGTCCCXTTGCTCCCGGCAAGCGCTTCTGCTGAAAGTTCATATCTGGAGCCTGATGTCTTAACGA 
ATAAAGGTCCCATGCTCCACCCGAGGACAGTTCTTCGTGCCTGAGACTTTCTGAGGTTGTGCTTTATTTCTGCTGCG 
TCGTGGGAGAGGGCGGGAGGGTGTCAGGGGAGAGTCTGCCCAGGCCTCAAGGGCAGGAAAAGACTCCCTAAGGAGCT 
GCAGTGCATGCAAGGATATTTTGAATCCAGACTGGCACCCACGTCACAGGAAAGCCTAGGAACACTGTAAGTGCCGC 
TTCCTCGGGAAAGCAGAAAAAATACATTTCAGGTAGAAGTTTTCAAAAATCACAAGTCTTTCTTGGTGAAGACAGCA 
AGCCAATAAAACTGTCTTCCAAAGTGGTCCTTTATTTCACAACCACTCTCGCTACTGTTCAATACTTGTACTATTCC 
TGGGTTTTGTTTCTTTGTACAGTAAACATTATGAACAAACAGGCAAA 

<210> SEQ ID NO 905 

<211> Length : 111 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 905 
>HUMTREFAC_PEA_2_node_2 

GGAAAGTGCATCTTCCTAAGGGCGAGGGTTTCAGCAGTGGTTGAACTCGGCGGGGTGGGGCGGAGCGGGAGGATGCA 
AACTTGCAAAGTGAAGCAAACACACTCACCGCAG 

<210> SEQ ID NO 906 

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 906 
>HUMTREFAC_PEA_2_node_3

CCCAGCAAGGGCTCTGGCAGCTGACAGGGCTTTGTCTGGGACAG 

<210> SEQ ID NO 907 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 907
>HUMTREFAC_PEA_2_node_4 

CTGCAAACCAGTGTGCCGTGCCAGCCAAGGACAGGGTGGACTGCGGCTACCCCCATGTCACCCCCAAGGAGTGCAAC 
AACCGGGGCTGCTGCTTTGA 

<210> SEQ ID NO 908 

<211> Length : 50 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 908 
>HUMTREFAC_PEA_2_node_5 

CTCCAGGATCCCTGGAGTGCCTTGGTGTTTCAAGCCCCTGCAGGAAGCAG 

<210> SEQ ID NO 909 

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 909 

>HUMTREFAC_PEA_2_node_8 

AATGCACCTTCTGAGGCAC 



Segment nucleic acid sequences: 

<210> SEQ ID NO 910

<211> Length : 1,133

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 910 
>HSS100PCB_node_3 

TGAGACAAGATGTCACTCTGTCACCCAGGCTGGAGTGCAGTGGCAGGATCACGGCTCACTGCAGCCTCGACCTCCCT 
GGGCTCAGGTGATCCTCCCACCTCAGCCTACCGAGTAGCTGGGACTACAGGTGCATGTCACCATACCCGGTTAATTT 
TTGTATTTTTTTTAGAGACAAGGTCTCACCATGTTGCCCAGGCTGGTCTCAAACTCCTGTGCTCAGGCAATGCGTCA 
GCCTCGACATCTCAAAGTGCTGTGATTACAGGCGTGAGCCCCGACACCTGGCCTAGTTCTATTTTCTAAATGTGAAT 
TCTGTAAAGATATCTTTTAAAAATAAAGTTCTGTTTTTGGTAGAAAATGTAAAAATAGATAAATATGGAGGGAAGAA 
ATCCCCCCTGGAATACAGACGCTTCCTCTCCCTTCCAGCCTTTTCCCCATATGAACATTGCTGTGAGTGAGACTTAC 
ATGCAATGTAATTTCTTTTTGAGCTTAACATTACAACATAAATTCTCAAACTCTGATGTTCATTAAACACCCCAGCC 
CCATCCTGGGAACTTGGGCTTGGGGCTCGGGGTGTTCTGATAATGATCAAAGTATGAGAATTGAACCCATGAGGACT 
TTGATCCAAGATACTGGGGTGTGGGGAGGGGCAGGCACAGGTGTCCTGGGAACACACTTTGAGAAGCAATGGCAAAG 
CTGGGGGTCCAGCTAATGTGTTACATTAGAATCACCTCGGGGAGGCCCTGGGTGCCCTTCTCAGCCCTCCCTCCGGA 
GGCTGCTGAAGCCCAGCAAAGCCGGAGTCAGAGAACAATGTCCGCCTGAGGGCAGGGCTGGGCTGGGCTGGCCTTCT 
GGCCCTATCTGCTCCGTGCCCAACCCAGCGCCCCGCACAGTCGGAGCTTTGTAAATACGAGGTGACTGTCTGCCTAC 
AAACTTTGTAAACATCACTTGAAATGGCCGCAGGGCATTGCGACATGGCCATACCACTATTTGTTTGCTATTGAATT 
TGTACTTCCCTGCCTTACTTTTGCTATTGCAAACCATGCTGTCACTAAGGTCTTCATGCACACAGTTGTGTCTTGGT 
CAGATGATATGTTTCTACCAATTTTAATTGTGTTTCTTTCCACCTGGACACACAG 

<210> SEQ ID NO 911 

<211> Length : 790 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 911 
>HSS100PCB_node_4

CTCTCTGGCCCAGGGCTGGGTCATCAGCACACCCTGCTGCTGCTGTTCAGATCTGCATCCTGGTCCCGCTTGGTCCC 
ACAGTGAGAACGCTTTGCTATCACATGGGCAGGCTCTGAGAGCCCTGCCGGCCTGGCCTTCTCAAAGAAGACCTGAG 
AGCTTGGGACCCAAGCAGAGAGGAAGAACAGGGCTCAGGGTGCTTGCTCCATGCTCGCTCCACACCTGGGGCTCAAC 
CCTGGCTTTCCCCGGCTCCCTGTGTGACTTCAGGGCAGGTCCCTTGGGCCCTCTGGGCCTTATCATCTTCATCTGTA 
ACAGGGCGATGCCTCTGCCGTGTCTGGTGGTGTTGAGGAGTTCCTGTTTGTGTAAGCAGCTAGTTCAGTGCCAGCAC 
GAGATGGGAGGCCCATGAAGTTAGCAGTGCACAAAAAATAGAGCAAAGACTGGATGCATTTCCTGAGAACAACCATC 
ACTGTAAAGCACTTTACAAATCCAAAGACAACCCCCGGCAAAAACTCAAAATGAAACTCCCTCTCGCAGAGCACAAT 
TCCAATTCGCTCTAAAAACATTACAAGTTAGTTCATGTCATGCCAGATAGCTGAAGGCAGCTCACAAGTTCTTAAGG 
CCAGGAATGCCATGTGTCTGCTATGCACAGCTGGCCCTGGCCCTGAGCCTGAATGACAGCACAAAGGTGACGCAGAT
GTGGGTGCCCTGCTCCTGCCCAGCAGCAGTGCTTGGTGGAGGCTGAGGCCCTGCACAGGCACCCTCACTGCTGACCT 
TGAGCCTCTCTCTCCTCTAG 

<210> SEQ ID NO 912 

<211> Length : 643 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 912 
>HSS100PCB_node_5 

AGTGGAAAAGACAAGGATGCCGTGGATAAATTGCTCAAGGACCTGGACGCCAATGGAGATGCCCAGGTGGACTTCAG 
TGAGTTCATCGTGTTCGTGGCTGCAATCACGTCTGCCTGTCACAAGTACTTTGAGAAGGCAGGACTCAAATGATGCC 
CTGGAGATGTCACAGATTCCTGGCAGAGCCATGGTCCCAGGCTTCCCAAAAGTGTTTGTTGGCAATTATTCCCCTAG 
GCTGAGCCTGCTCATGTACCTCTGATTAATAAATGCTTATGAAATGATCCTGGTCTCAGAGAAACTGGTTATTTCCT 
TAAAACCTCAGTGTCCTCTCCCCACAAAACAGAATTCACTAATTAAACTTGTCATTTCATTACTCATTGCCTGCCTG 
CAGTGGTGAGACAGATGAAGAAGACAAGGCAGTCGATAAAAATCCATTTGTTGGCCATCTTCTGAGTGCCCACTGAG 
ACACAGTCCCACCTTTGGCCGTGTGAATGCCACCATGTCCTGGACCCAAGGGAGGCTCGGGTACACACATCCATGTA 
TTTATTCATCTACTTCGTAATTCATTTAGTCGGTCACGTAGCCACTCATTGAACCTGGGCCCCTCCATCTCTCCAAA 
CAGCATGTTTTCTTCAAGTAACATACT 

Segment nucleic acid sequences: 

<210> SEQ ID NO 913 

<211> Length : 1,298 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 913 
>R20779_node_0

CTGTCCCCGGCCCCGGCGCGGGGGAGACGTGAGCGTGCACACGTACACACACAGCAGGGGAAGAGGCGCTCCAAGCG 
GCGCCCAACTTTCTCCTTCCCTCCACGGGCCGGGTGAGAAAGTAGCCGGGGGCTATCCCGACCCGGCGGTTCTTGGG 
GAGGGGGCCGAACAAGAAAAGGGAGGAGATGGAGATAACTTCCCCGGATTTAGCTTTTTTGTCTTTGTTTTTGTTCT 
CACCACTTCCATCGGATGACTGGAGAGTAAAAGGGAACCCGGAGCGGGGTGGCGAGCAGCGCTTTGAGAAAATGCAG 
GAGTGTGTTTGGAGACGCGTAAAGTTGCCTTTCAAGCTCTGGCCTCCGGGCACGCGATGCTCCGCGGCGGGCTGACT
CAGGGCTGCCTTGGGCCTCCCTGCCACCCTCCTGGAAATGATGCAAGTCCTGACTGTCACCTGGATCCCTGCAGCCC 
AGCCTGGAATGCGTCTGGATTAGGGGAAAGACGAGAAACGACACTCCAGGTGTTGCACGGCCCACCAAAGCGGGAAG 
ATAGGGCAGTTGCTCAGACCAAATACTGTATCTAGTGCTTCTGCTCCTATCTTCAATCGTGGGGTTCTTTTTAATGC 
AAAGTGTCACAAGGCCAGGAATTCCCATGTGTGCTCAGTTGGCCCACAGCATCATTGTGCCTAGGAAACTGCTTCAA 
TTTATCAAGTCCTCTGGGCTGGGAATCTCACTGAATTCCAAACGGCGGAAAGAGGAAACTTTCCCAACCCGATGTGG 
GTGTGACGCGAGCCAGGGGCCCCAGGGACACTGTCCCAGAGCACACCGTCCCCCTTTAACAGCAACTGGAGCTTGGA 
TTCGCTCTTATATTGTACAGTCCTTTCGACCATTGCCCTGGAGCACCCGCACACGCGCACGCATCTCCGGCCGCGCT 
CACACACACTCATACACACGCACGCAAACGCGTGGCCGCCGCCAGGTCGGCAACTTTGTCCGGCGCTCCCAGCGGCG 
CTCGGCTTCCTCCTGTAGTAGTTGAGCGCAGGCCCCGCCTCCCGGCCGTGTTGTCAAAAGGGCCGGGGTCTCGGATT 
GGTCCAGCCGCCGGGACAACACCTGCTCGACTCCTTCATTCAAGTGACACCAGAGCTTCCAGGGATATTTGAGGCAC 
CATCCCTGCCATTGCCGGGCACTCGCGGCGCTGCTAACGGCCTGGTCACATGCTCTCCGGAGAGCTACGGGAGGGCG 
CTGGGTAACCTCTATCCGAGCCGCGGCCGCGAGGAGGAGGGAAAAGGCGAGCAAAAAGGAAGAGTG 

<210> SEQ ID NO 914 

<211> Length : 170 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 914 
>R20779_node_2

GGAGGAAGAGGGGAGCACAAAGGATCCAGGTCTCCCGACGGGAGGTTAATACCAAGAACCATGTGTGCCGAGCGGCT 
GGGCCAGTTCATGACCCTGGCTTTGGTGTTGGCCACCTTTGACCCGGCGCGGGGGACCGACGCCACCAACCCACCCG
AGGGTCCCCAAGACAG

<210> SEQ ID NO 915 

<211> Length : 143 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 915 
>R20779_node_7

CGGAGATCCAGCACTGTTTGGTCAACGCTGGCGATGTGGGGTGTGGCGTGTTTGAATGTTTCGAGAACAACTCTTGT 
GAGATTCGGGGCTTACATGGGATTTGCATGACTTTTCTGCACAACGCTGGAAAATTTGATGCCCAG

<210> SEQ ID NO 916

<211> Length : 148 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 916 
>R20779_node_9

GGCAAGTCATTCATCAAAGACGCCTTGAAATGTAAGGCCCACGCTCTGCGGCACAGGTTCGGCTGCATAAGCCGGAA 
GTGCCCGGCCATCAGGGAAATGGTGTCCCAGTTGCAGCGGGAATGCTACCTCAAGCACGACCTGTGCGCGG 

<210> SEQ ID NO 917 

<211> Length : 168 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 917 
>R20779_node_18

CTGTGGGGAGGAGGTGAAGGAGGCCATCACCCACAGCGTGCAGGTTCAGTGTGAGCAGAACTGGGGAAGCCTGTGCT 
CCATCTTGAGCTTCTGCACCTCGGCCATCCAGAAGCCTCCCACGGCGCCCCCCGAGCGCCAGCCCCAGGTGGACAGA
ACCAAGCTCTCCAG

<210> SEQ ID NO 918 

<211> Length : 578 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 918 
>R20779_node_21

CAGTAGGGAGACTGGCCGAGGTGCCAAGGGTGAGCGAGGTAGCAAGAGCCACCCAAACGCCCATGCCCGAGGCAGAG 
TCGGGGGCCTTGGGGCTCAGGGACCTTCCGGAAGCAGCGAGTGGGAAGACGAACAGTCTGAGTATTCTGATATCCGG 
AGGTGAAATGAAAGGCCTGGCCACGAAATCTTTCCTCCACGCCGTCCATTTTCTTATCTATGGACATTCCAAAACAT
TTACCATTAGAGAGGGGGGATGTCACACGCAGGATTCTGTGGGGACTGTGGACTTCATCGAGGTGTGTGTTCGCGGA 
ACGGACAGGTGAGATGGAGACCCCTGGGGCCGTGGGGTCTCAGGGGTGCCTGGTGAATTCTGCACTTACACGTACTC 
AAGGGAGCGCGCCCGCGTTATCCTCGTACCTTTGTCTTCTTTCCATCTGTGGAGTCAGTGGGTGTCGGCCGCTCTGT 
TGTGGGGGAGGTGAACCAGGGAGGGGCAGGGCAAGGCAGGGCCCCCAGAGCTGGGCCACACAGTGGGTGCTGGGCCT 
CGCCCCGAAGCTTCTGGTGCAGCAGCCTCTGGTGCTGTC 

<210> SEQ ID NO 919 

<211> Length : 691 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 919 
>R20779_node_24

GGAGTGTCATTTCTATGTGTAATTTCTGAGCCATTGTACTGTCTGGGCTGGGGGGGACACTGTCCAAGGGAGTGGCC 
CCTATGAGTTTATATTTTAACCACTGCTTCAAATCTCGATTTCACTTTTTTTATTTATCCAGTTATATCTACATATC 
TGTCATCTAAATAAATGGCTTTCAAACAAAGCAACTGGGTCATTAAAACCAGCTCAAAGGGGGTTTAAAAAAAAAAA 
ACCAGCCCATCCTTTGAGGCTGATTTTTCTTTTTTTTAAGTTCTATTTTAAAAGCTATCAAACAGCGACATAGCCAT 
ACATCTGACTGCCTGACATGGACTCCTGCCCACTTGGGGGAAACCTTATACCCAGAGGAAAATACACACCTGGGGAG 
TACATTTGACAAATTTCCCTTAGGATTTCGTTATCTCACCTTGACCCTCAGCCAAGATTGGTAAAGCTGCGTCCTGG 
CGATTCCAGGAGACCCAGCTGGAAACCTGGCTTCTCCATGTGAGGGGATGGGAAAGGAAAGAAGAGAATGAAGACTA 
CTTAGTAATTCCCATCAGGAAATGCTGACCTTTTACATAAAATCAAGGAGACTGCTGAAAATCTCTAAGGGACAGGA 
TTTTCCAGATCCTAATTGGAAATTTAGCAATAAGGAGAGGAGTCCAAGGGGACAAATAAAGGCAGAGAGAAGAGA 

<210> SEQ ID NO 920 

<211> Length : 131 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 920 
>R20779_node_27

CTAAAAATACGAGGAAAGGAGAGTGAGGATTTTCATTAAAAGTCTCAGCAGTGGGTTTCTTGGGTTATTTAAAACAT 
CACCTAAATAGGCCTTTTCTTCCTAATTGGCCATCAAATTAAAGCCTATCCTTT

<210> SEQ ID NO 921

<211> Length : 247 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 921 
>R20779_node_28

CTCAAGCAGGAGCTGGTATTGTAGGGAGTGGCCGGGTATTCTGGGCTGGGCTCTTCTGGAGTAGGGGGTCAGCAAAC 
ATTGTCTGCAAAGGGCCAGATACTGAATCCAGTACTTTCAGTTTGGCGAGCCGTGAGGTCTCTGTCGAAACTACTCA 
ACTCTGCCGTCCTAGCACAAAAGCAGCCATAGACAACACACAAACGAGAGGGCTTGGCTCCCTTCCAGGAAGATTTA
TTTAACAGGCTCCCAG

<210> SEQ ID NO 922 

<211> Length : 126 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 922
>R20779_node_30 

GACACAGTTTACTGATCTCTGTTCTAGTGAGTGGGTCAAAAAGCATATGCATCCTTATCCGTCAACTCATCAGCTCT 
TCCTCAAGGCAACCTGAGGCCAGACACCAAGAAACCAAGCGTATCTGCT 

<210> SEQ ID NO 923 

<211> Length : 231 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 923 
>R20779_node_31

CTAAAATGACTTGTTCCTGGGGAATGCCTTCAACCAAAACACAGCTAGTATTTCTATGCCCCAAATCCAATCCCAGT 
CTTTCATGATCCATGCCGGCGGTTGGGTGGGGAGGGGAATCATTGGTTGGGGGAAGGGAGGAAACCCCACCTCCAGC 
CCCCGCCACCGGGCTCCCTGGGCACCCAGCAAGATCTGGGGCTGCAGAGAACAGAAGAGCTGGTGCACTTAATCCAG

<210> SEQ ID NO 924

<211> Length : 1,079

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 924 
>R20779_node_32

CTCTGCCCTTGGGGGGAGGAGGACCTGTGTGTCAGGCTCTGCCATGGGAACGAGTGTAAACCGTGGCTGTCTCCTGC 
AGTGAGCCACCGCGGCAGGCACGTTGACTGTTTTACTGACATCACTCAAAAGCTAAAGCAATAACATTCTCCTGCGT 
TGCTGAGTCAGCTGTTCATTTGTCCGCCAGCTCCTGGACTGGATGTGTGAAAGGCATCACATTTCCATTTTCCTCCG 
TGTAAATGTTTTATGTGTTCGCCTACTGATCCCATTCGTTGCTXCTATTGTAAATATTTGTCATTTGTATTTATTAT 
CTCTGTGTTTTCCCCCTAAGGCATAAAATGGTTTACTGTGTTCATTTGAACCCATTTACTGATCTCTGTTGTATATT 
TTTCATGCCACTGCTTTGTTTTCTCCTCAGAAGTCGGGTAGATAGCATTTCTATCCCATCCCTCACGTTATTGGAAG 
CATGCAACAGTATTTATTGCTCAGGGTCTTCTGCTTAAAACTGAGGAAGGTCCACATTCCTGCAAGCATTGATTGAG 
ACATTTGCACAATCTAAAATGTAAGCAAAGTAGTCATTAAAAATACACCCTCTACTTGGGCTTTATACTGCATACAA 
ATTTACTCATGAGCCTTCCTTTGAGGAAGGATGTGGATCTCCAAATAAAGATTTAGTGTTTATTTTGAGCTCTGCAT 
CTTAACAAGATGATCTGAACACCTCTCCTTTGTATCAATAAATAGCCCTGTTATTCTGAAGTGAGAGGACCAAGTAT 
AGTAAAATGCTGACATCTAAAACTAAATAAATAGAAAACACCAGGCCAGAACTATAGTCATACTCACACAAAGGGAG 
AAATTTAAACTCGAACCAAGCAAAAGGCTTCACGGAAATAGCATGGAAAAACAATGCTTCCAGTGGCCACTTCCTAA 
GGAGGAACAACCCCGTCTGATCTCAGAATTGGCACCACGTGAGCTTGCTAAGTGATAATATCTGTTTCTACTACGGA 
TTTAGGCAACAGGACCTGTACATTGTCACATTGCATTATTTTTCTTCAAGCGTTAATAAAAGTTTTAAATAAATGGC
T

<210> SEQ ID NO 925 

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 925 
>R20779_node_1

GGAGGAGGAGGGGAAGCGGCGAAGGAGGAAGAGGAGGA

<210> SEQ ID NO 926

<211> Length : 41 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 926 
>R20779_node_3

GAGCTCCCAGCAGAAAGGCCGCCTGTCCCTGCAGAATACAG 

<210> SEQ ID NO 927 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 927 
>R20779_node_10
CTGCCCAGGAG 

<210> SEQ ID NO 928 

<211> Length : 53 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 928 
>R20779_node_11

AACACCCGGGTGATAGTGGAGATGATCCATTTCAAGGACTTGCTGCTGCACGA 

<210> SEQ ID NO 929

<211> Length : 73

<212> Type : DNA

<213> Organism : Homo sapiens

<400> sequence : 929 
>R20779_node_14

ATGCTACAAGATAGAAATTACTATGCCCAAGAGGAGGAAAGTGAAGCTAAGAGATTAGAGAACTCGGACTGAG 

<210> SEQ ID NO 930 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 930 
>R20779_node_17

ACCCTACGTGGACCTCGTGAACTTGCTGCTGAC 

<210> SEQ ID NO 931 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 931 
>R20779_node_19
GGCCCACCACGG 

<210> SEQ ID NO 932 

<211> Length : 30 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 932 
>R20779_node_20
GGAAGCAGGACATCACCTCCCAGAGCCCAG

<210> SEQ ID NO 933 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 933 
>R20779_node_22

TCCGCGGAAGTCAGGGCGGCTGGATTCCAGGACAGGAGTGAATGTAAAAATAAATATCGCTTAGAATGCAGGAGAAG 
GGTGGAGAGGAGGCAGGGGCCGAGGG 

<210> SEQ ID NO 934 

<211> Length : 77

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 934 
>R20779_node_23

GGTGCTTGGTGCCAAACTGAAATTCAGTTTCTTGTGTGGGGCCTTGCGGTTCAGAGCTCTTGGCGAGGGTGGAGGGA 

<210> SEQ ID NO 935 

<211> Length : 5 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 935 

>R20779_node_25 

CAGAA 

<210> SEQ ID NO 936

<211> Length : 17

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 936 
>R20779_node_29
CTGAATTTCACTCACAG 

Segment nucleic acid sequences:

<210> SEQ ID NO 937 

<211> Length : 167

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 937 
>R38144_PEA_2_node_21

TTCATGGCGTGAACCCAGGAGAGACCCCTGTCACCTGTACGGCAGGGATTGGGACCTTCATTGTTGAATTTGCCACC 
CTGAGCAGCCTCACTGGTGACCCGGTGTTCGAAGATGTGGCCAGAGTGGCTTTGATGCGCCTCTGGGAGAGCCGGTC
AGATATCGGGCTG

<210> SEQ ID NO 938 

<211> Length : 142 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 938 
>R38144_PEA_2_node_26

GTCGGCAACCACATTGATGTGCTCACTGGCAAGTGGGTGGCCCAGGACGCAGGCATCGGGGCTGGCGTGGACTCCTA 
CTTTGAGTACTTGGTGAAAGGAGCCATCCTGCTTCAGGATAAGAAGCTCATGGCCATGTTCCTAG

<210> SEQ ID NO 939

<211> Length : 125 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 939 
>R38144_PEA_2_node_29

AGTATAACAAAGCCATCCGGAACTACACCCGCTTCGATGACTGGTACCTGTGGGTTCAGATGTACAAGGGGACTGTG 
TCCATGCCAGTCTTCCAGTCCTTGGAGGCCTACTGGCCTGGTCTTCAG 

<210> SEQ ID NO 940 

<211> Length : 145 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 940 
>R38144_PEA_2_node_31

AGCCTCATTGGAGACATTGACAATGCCATGAGGACCTTCCTCAACTACTACACTGTATGGAAGCAGTTTGGGGGGCT 
CCCGGAATTCTACAACATTCCTCAGGGATACACAGTGGAGAAGCGAGAGGGCTACCCACTTCGGCCAG 

<210> SEQ ID NO 941 

<211> Length : 172 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 941 
>R38144_PEA_2_node_46

AAGCTGGACAACCGCATGGAGTCGTTCTTCCTGGCCGAGACTGTGAAATACCTCTACCTCCTGTTTGACCCAACCAA
CTTCATCCACAACAATGGGTCCACCTTCGACGCGGTGATCACCCCCTATGGGGAGTGCATCCTGGGGGCTGGGGGGT
ACATCTTCAACACAGAAG

<210> SEQ ID NO 942

<211> Length : 375 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 942 
>R38144_PEA_2_node_47

CTCACCCCATCGACCCTGCCGCCCTGCACTGCTGCCAGAGGCTGAAGGAAGAGCAGTGGGAGGTGGAGGACTTGATG 
AGGGAATTCTACTCTCTCAAACGGAGCAGGTCGAAATTTCAGAAAAACACTGTTAGTTCGGGGCCATGGGAACCTCC 
AGCAAGGCCAGGAACACTCTTCTCACCAGAAAACCATGACCAGGCAAGGGAGAGGAAGCCTGCCAAACAGAAGGTCC 
CACTTCTCAGCTGCCCCAGTCAGCCCTTCACCTCCAAGTTGGCATTACTGGGACAGGTTTTCCTAGACTCCTCATAA 
CCACTGGATAATTTTTTTATTTTTATTTTTTTGAGGCTAAACTATAATAAATTGCTTTTGGCTATCA 

<210> SEQ ID NO 943 

<211> Length : 122 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 943 
>R38144_PEA_2_node_49

AAAAGATCTCGCTCTGTTGCCCAGGCTGGAGTGCAGTGGTGTGATCACGACTCACCGCAGCCTTGACCTCCCACACT 
CAAGCAATCCTCCTGCCTTAGCCTTCCAAGTAGCTGGAACTCCAG 

<210> SEQ ID NO 944 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 944 
>R38144_PEA_2_node_0

GGATTCCCGGAAGAACCCGCAGCAGCTCCCAGGATGAACTGGTTGCAGTGGCTGCTGCTGCTGCGGGGGCGCTGAGA
GGACACGAGCTCTATGCCTTTCCGGCTG 

<210> SEQ ID NO 945 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 945 
>R38144_PEA_2_node_1

CTCATCCCGCTCGGCCTCCTGTGCGCGCTGCTGCCTCAGCACCATGGTGCGCCAGGTCCCGACGGCTCCGCGCCAGA 
TCCCGCCCACTACAG 

<210> SEQ ID NO 946 

<211> Length : 102 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 946 
>R38144_PEA_2_node_4

GGAGCGAGTCAAGGCCATGTTCTACCACGCCTACGACAGCTACCTGGAGAATGCCTTTCCCTTCGATGAGCTGCGAC 
CTCTCACCTGTGACGGGCACGACAC 

<210> SEQ ID NO 947 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 947 
>R38144_PEA_2_node_5
CTGGGGCAG 

<210> SEQ ID NO 948 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 948 
>R38144_PEA_2_node_7

TTTTTCTCTGACTCTAATTGATGCACTGGACACCTTGCTG 

<210> SEQ ID NO 949 

<211> Length : 106 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 949 
>R38144_PEA_2_node_11

ATTTTGGGGAATGTCTCAGAATTCCAAAGAGTGGTTGAAGTGCTCCAGGACAGCGTGGACTTTGATATTGATGTGAA 
CGCCTCTGTGTTTGAAACAAACATTCGAG 

<210> SEQ ID NO 950 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 950 
>R38144_PEA_2_node_14
TGGTAG

<210> SEQ ID NO 951

<211> Length : 27 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 951 

>R38144_PEA_2_node_15

GAGGACTCCTGTCTGCTCATCTGCTCT 



<210> SEQ ID NO 952 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 952 
>R38144_PEA_2_node_16

CCAAGAAGGCTGGGGTGGAAGTAGAGGCTGGATGGCCCTGTTCCGGGCCTCTCCTGAGAATGGCTGAGGAGGCGGCC 
CGAAAACTCCTCCCAG 



<210> SEQ ID NO 953 

<211> Length : 35 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 953 
>R38144_PEA_2_node_19

CCTTTCAGACCCCCACTGGCATGCCATATGGAACA

<210> SEQ ID NO 954

<211> Length : 10 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 954 
>R38144_PEA_2_node_20
GTGAACTTAC 

<210> SEQ ID NO 955 

<211> Length : 89 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 955 
>R38144_PEA_2_node_36

AACTTATTGAAAGCGCAATGTACCTCTACCGTGCCACGGGGGATCCCACCCTCCTAGAACTCGGAAGAGATGCTGTG 
GAATCCATTGAA



<210> SEQ ID NO 956 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 956 

>R38144_PEA_2_node_37

AAAATCAGCAAGGTGGAGTGCGGATTTGCAACA

<210> SEQ ID NO 957

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 957 
>R38144_PEA_2_node_43
CTTGCTTCCTTCTCCCACAT 

<210> SEQ ID NO 958 

<211> Length : 5 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 958 
>R38144_PEA_2_node_44
GTCAG 

<210> SEQ ID NO 959 

<211> Length : 21 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 959 
>R38144_PEA_2_node_45
ATCAAAGATCTGCGAGACCAC

<210> SEQ ID NO 960

<211> Length : 74 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 960 
>R38144_PEA_2_node_51

GTGGTGGTTAATTTTATGTGTCAACCTGGCTGGACCACTGGGTACTCCGATATTTGGTCAAACATTATTCTGAG

Segment nucleic acid sequences:

<210> SEQ ID NO 961 

<211> Length : 184 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 961 
>HUMOSTRO_PEA_1_PEA_1_node_0

GTGGCAGAAAACCTCATGACACAATCTCTCCGCCTCCCTGTGTTGGTGGAGGATGTCTGCAGCAGCATTTAAATTCT 
GGGAGGGCTTGGTTGTCAGCAGCAGCAGGAGGAGGCAGAGCACAGCATCGTCGGGACCAGACTCGTCTCAGGCCAGT 
TGCAGCCTTCTCAGCCAAACGCCGACCAAG 

<210> SEQ ID NO 962 

<211> Length : 189 

<212> Type : DNA 

<213> Organism : Homo sapiens

<400> sequence : 962
>HUMOSTRO_PEA_1_PEA_1_node_10

CACTAAAGATGTACCTACCCCTCCACAACAGATGAAACTGTGCCAGCCAAACAACAAATGGGCATTGTCCCCAGAAG 
CTTGGACAAAAAGGCACACAGAGTTCAATTCCAGTTGAACAGAATAAAGGCCAAAATAGAGCTGCCTTGGGGGTCAC 
TGCAATTAGACTGCTTAATGAAGACATTAAAAGAA 

<210> SEQ ID NO 963 

<211> Length : 266 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 963 
>HUMOSTRO_PEA_1_PEA_1_node_16

GTATTTTTAAACTTCTCATAATTAAACTACAGTGATGAAAGATAGCCACACTCAGGCCATTTGGGCTGCTCAGATGA 
ATCCTGCCTGCCTGCTGGCAAACATGTGCTTAGGACATTGACTGATCTGCCATGTTGGCTTCTCTCTGTGTTAAGCC 
ATCCACAGATGAGGCTGAAAAATAAAAACTGCTTTGGATTAAAAAGGTTAACTTTTGAATAAAAAAGCTAGGCATGT 
GTGATGCGCACTAACACGTGCCATTCCTTCTTCAG 

<210> SEQ ID NO 964 

<211> Length : 164 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 964 
>HUMOSTRO_PEA_1_PEA_1_node_23

ACTGATGATTCTCACCAGTCTGATGAGTCTCACCATTCTGATGAATCTGATGAACTGGTCACTGATTTTCCCACGGA 
CCTGCCAGCAACCGAAGTTTTCACTCCAGTTGTCCCCACAGTAGACACATATGATGGCCGAGGTGATAGTGTGGTTT 
ATGGACTGAG

<210> SEQ ID NO 965

<211> Length : 230 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 965 
>HUMOSTRO_PEA_1_PEA_1_node_31

AGTGCTGAAACCCACAGCCACAAGCAGTCCAGATTATATAAGCGGAAAGCCAATGATGAGAGCAATGAGCATTCCGA 
TGTGATTGATAGTCAGGAACTTTCCAAAGTCAGCCGTGAATTCCACAGCCATGAATTTCACAGCCATGAAGATATGC 
TGGTTGTAGACCCCAAAAGTAAGGAAGAAGATAAACACCTGAAATTTCGTATTTCTCATGAATTAGATAGTGCATC 

<210> SEQ ID NO 966 

<211> Length : 136 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 966 
>HUMOSTRO_PEA_1_PEA_1_node_43

TGGTGTGAATAAATCTTTTATCTTGAATGTAATAAGAATTTGGTGGTGTCAATTGCTTATTTGTTTTCCCACGGTTG 
TCCAGCAATTAATAAAACATAACCTTTTTTACTGCCTATATAATGTTTTTAAAGGTTTA 

<210> SEQ ID NO 967 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 967 

>HUMOSTRO_PEA_1_PEA_1_node_3

GAAAACTCACTAGCATGAGAATTGCA

<210> SEQ ID NO 968

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 968 
>HUMOSTRO_PEA_1_PEA_1_node_5

GTGATTTGCTTTTGCCTCCTAGGCATCACCTGTGCCATACCA 

<210> SEQ ID NO 969 

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 969 

>HUMOSTRO_PEA_1_PEA_1_node_7

GTTAAACAGGCTGATTCTGGAAGTTCTGAGGAAAAGCAG 

<210> SEQ ID NO 970 

<211> Length : 87 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 970 
>HUMOSTRO_PEA_1_PEA_1_node_8

GTAAGCATCTTTTATGTTTTTATATAGTTAAATCATTTACTCAATTATGGCGAGAGGTGCAAGAAACGTATTTGCTG 
CGATATTACT

<210> SEQ ID NO 971

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 971 
>HUMOSTRO_PEA_1_PEA_1_node_15

CTTTACAACAAATACCCAGATGCTGTGGCCACATGGCTAAACCCTGACCCATCTCAGAAGCAGAATCTCCTAGCCCC 
ACAG 



<210> SEQ ID NO 972 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 972 
>HUMOSTRO_PEA_1_PEA_1_node_17

AATGCTGTGTCCTCTGAAGAAACCAATGACTTTAAACAAGAG 

<210> SEQ ID NO 973 

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 973 
>HUMOSTRO_PEA_1_PEA_1_node_20
ACCCTTCC

<210> SEQ ID NO 974 

<211> Length : 50 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 974
>HUMOSTRO_PEA_1_PEA_1_node_21

AAGTAAGTCCAACGAAAGCCATGACCACATGGATGATATGGATGATGAAG 

<210> SEQ ID NO 975 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 975 
>HUMOSTRO_PEA_1_PEA_1_node_22

ATGATGATGACCATGTGGACAGCCAGGACTCCATTGACTCGAACGACTCTGATGATGTAGATGAC 

<210> SEQ ID NO 976 

<211> Length : 37 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 976 
>HUMOSTRO_PEA_1_PEA_1_node_24




GTCAAAATCTAAGAAGTTTCGCAGACCTGACATCCAG 

<210> SEQ ID NO 977 

<211> Length : 18 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 977 
>HUMOSTRO_PEA_1_PEA_1_node_26
TACCCTGATGCTACAGAC 

<210> SEQ ID NO 978 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 978 

>HUMOSTRO_PEA_1_PEA_1_node_27

GAGGACATCACCTCACACATGGAAAG 

<210> SEQ ID NO 979 

<211> Length : 52 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 979 




>HUMOSTRO_PEA_1_PEA_1_node_28

CGAGGAGTTGAATGGTGCATACAAGGCCATCCCCGTTGCCCAGGACCTGAAC 

<210> SEQ ID NO 980 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 980 
>HUMOSTRO_PEA_1_PEA_1_node_29

GCGCCTTCTGATTGGGACAGCCGTGGGAAGGACAGTTATGAAACGAGTCAG 

<210> SEQ ID NO 981 

<211> Length : 12 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 981 
>HUMOSTRO_PEA_1_PEA_1_node_30
CTGGATGACCAG 



<210> SEQ ID NO 982 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 982 

>HUMOSTRO_PEA_1_PEA_1_node_32

TTCTGAGGTCAATTAAAAGGAGAAAAAATACAAT 
<210> SEQ ID NO 983

<211> Length : 41 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 983 
>HUMOSTRO_PEA_1_PEA_1_node_34
TTCTCACTTTGCATTTAGTCAAAAGAAAAAATGCTTTATAG 



<210> SEQ ID NO 984 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 984 
>HUMOSTRO_PEA_1_PEA_1_node_36
CAAAATGAAAGAGAACATGAAATGCTTCTTTCTCAG 



<210> SEQ ID NO 985 

<211> Length : 119 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 985
>HUMOSTRO_PEA_1_PEA_1_node_37

TTTATTGGTTGAATGTGTATCTATTTGAGTCTGGAAATAACTAATGTGTTTGATAATTAGTTTAGTTTGTGGCTTCA 
TGGAAACTCCCTGTAAACTAAAAGCTTCAGGGTTATGTCTAT 



<210> SEQ ID NO 986 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 986 
>HUMOSTRO_PEA_1_PEA_1_node_38
GTTCATTCTAT 



<210> SEQ ID NO 987 

<211> Length : 91 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 987 
>HUMOSTRO_PEA_1_PEA_1_node_39

AGAAGAAATGCAAACTATCACTGTATTTTAATATTTGTTATTCTCTCATGAATAGAAATTTATGTAGAAGCAAACAA 
AATACTTTTACCCA 



<210> SEQ ID NO 988 
<211> Length : 18 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 988 
>HUMOSTRO_PEA_1_PEA_1_node_40
CTTAAAAAGAGAATATAA 

<210> SEQ ID NO 989 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 989 
>HUMOSTRO_PEA_1_PEA_1_node_41
CATTTT 

<210> SEQ ID NO 990 

<211> Length : 60 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 990 
>HUMOSTRO_PEA_1_PEA_1_node_42

ATGTCACTATAATCTTTTGTTTTTTAAGTTAGTGTATATTTTGTTGTGATTATCTTTTTG 



Segment nucleic acid sequences : 
<210> SEQ ID NO 991 
<211> Length : 153

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 991 
>R11723_PEA_1_node_13

ACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCAGTTGACCACTTAGTTCTCAAGAAGC 
AACTATCTCTTTCATGTGCCTTCTGAGGAAGTATTCAGAGGGGGAATATCAAATGTCTTTCCCTTGGACTCTCCCA 



<210> SEQ ID NO 992 

<211> Length : 744 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 992 
>R11723_PEA_1_node_16

GGTCTCACTGTGTCACGAGGCTGGAGTGCAGTGGAACAATTTCAGCACACTGCAACCTCTGCCTCCCAGGCTCAAAT 
GATCATCCCACCTAAGCCTCCGGAGTAGCTGGGACCACAGGCTAGCGCCACCATGCCCAGCTGATACCAATGTCTTT
TAAAAAATGTTGTATGTGGAAATAAATTGAGACTTATAGAAAAGCTGCAAAAATAGTGCAGTTTCTATATATCCTTC 
CCCCATCTTTGGCTAGTGTTAACAATCTACATAACCGCAGTACGATGATCAAGGCTAGGAAATTAACATTGGCACAG 
TACTGTTAATGAAACCATGCTTTGTTTTGAGATTCCCACAGTTTTGCCTTTTTCTGTTCCAAGATCCTATCCAGGAT 
CCCACGTTGCATTTCATTGTCATGTCTCCTTCTCCTCTAACCTCTGACAATGCATCATTCTTTCCATGTCTTTTGTG 
ATGTTGACACTTTTGAAGAGGACTGGTCCAGATTTTTGTACACTGTCCCTCAGTTTGGGATTGTCTGCTGTTTTCTC 
ATGAACAGATAGAGGTTTTGCATTTTTGACAAGAATCCTCAGAAGAGATGCACCCTTCTCAGTGCACTGTAGCAAGG 
GGCGCATGCTGTCAATGTCTTACTGGTGATGTTAACTTTGATCGCTTTTGATTCAGATAGTATCTGCTGGGTTTTTC 
CACTGTAAAGTTACTATTTTTTCCATTGTAATTAATAAATAACTTGAGGGA 



<210> SEQ ID NO 993 
<211> Length : 174 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 993 
>R11723_PEA_1_node_19

GTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACTCAGTTTGCATCAGCTGCTGCAACACCCCTCTTTGTAACGGGC 
CAAGGCCCAAGAAAAGGGGAAGTTCTGCCTCGGCCCTCAGGCCAGGGCTCCGCACCACCATCCTGTTCCTCAAATTA 

GCCCTCTTCTCGGCACACTG 



<210> SEQ ID NO 994 

<211> Length : 309 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 994 
>R11723_PEA_1_node_2

AGAAGAGGAAGACAGGAAGGGGGTGGGGATGTGAAGCGACCGTCCCAGCCTTCCCCGCCCGCCACCCCCACCCCAAC 
TCGGCAGCCGTCACGTGATGCCTGGAGTGGGAGGTGGGGAGAAAAGGCGAGACTTTTGTGGGTGCTCCCGATCGCCA 
GTAGTTCCTTCAGTCTCAGCCGCCAACTCCGGAGGCGCGGTGCTCGGCCCGGGAGCGCGAGCGGGAGGAGCAGAGAC 
CCGCAGCCGGGAGCCCGAGCGCGGGCGATGCAGGCTCCGCGAGCGGCACCTGCGGCTCCTCTAAGCTACGACCGTCG 

T 



<210> SEQ ID NO 995 

<211> Length : 487 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 995 
>R11723_PEA_1_node_22

GTGAGTTTCTTCTGGGTGTCCTTTTATTCTGGGTAGGGAGCGGGAGTCCGTGTTCTCTTTTGTTCCTGTGCAAATAA 
TGAAAGAGCTCGGTAAAGCATTCTGAATAAATTCAGCCTGACTGAATTTTCAGTATGTACTTGAAGGAAGGAGGTGG 
AGTGAAAGTTCACCCCCATGTCTGTGTAACCGGAGTCAAGGCCAGGCTGGCAGAGTCAGTCCTTAGAAGTCACTGAG 
GTGGGCATCTGCCTTTTGTAAAGCCTCCAGTGTCCATTCCATCCCTGATGGGGGCATAGTTTGAGACTGCAGAGTGA 
GAGTGACGTTTTCTTAGGGCTGGAGGGCCAGTTCCCACTCAAGGCTCCCTCGCTTGACATTCAAACTTCATGCTCCT 
GAAAACCATTCTCTGCAGCAGAATTGGCTGGTTTCGCGCCTGAGTTGGGCTCTAGTGACTCGAGACTCAATGACTGG 
GACTTAGACTGGGGCTCGGCCTCGC 

<210> SEQ ID NO 996 

<211> Length : 418 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 996 
>R11723_PEA_1_node_31

GGAGAGACAGAGAAAAGAAAAACACAGCATGAGAACACAGTAAATAAATAAAACCATAAAATATTTAGCCCCTCTGT
TCTGTGCTTACTGGCCAGGAAATGGTACCAATTTTTCAGTGTTGGACTTGACAGCTTCTTTTGCCACAAGCAAGAGA 
GAATTTAACACTGTTTCAAACCCGGGGGAGTTGGCTGTGTTAAAGAAAGACCATTAAATGCTTTAGACAGTGTATTT 
ATACCAGTTGATGTCTGTTAATTTTAAAAAAATGTTTTCATTGGTGTTTGTTTGCGTATCCAGAAAGCAGTTCATGT 
TATCCATAAATCTGGTTTTGTCTTTTTTTGTTTTAAAGAAAAAGATGTATACATACAGTATAGCTGCATTAGATAAA 
GCAGTGTTTGTATTTTAAAGGATGTCTGCACAA 

<210> SEQ ID NO 997 

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 997 
>R11723_PEA_1_node_10

GCTTTGCGCTGCAAATCCAGTGCTACCAGTGTGAAGAATTCCAG 
<210> SEQ ID NO 998

<211> Length : 94 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 998 
>R11723_PEA_1_node_11

CTGAACAACGACTGCTCCTCCCCCGAGTTCATTGTGAATTGCACGGTGAACGTTCAAGACATGTGTCAGAAAGAAGT 
GATGGAGCAAAGTGCCG 



<210> SEQ ID NO 999 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 999 
>R11723_PEA_1_node_15
ACAG 



<210> SEQ ID NO 1000 

<211> Length : 58 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1000
>R11723_PEA_1_node_18

GGATCATGTACCGCAAGTCCTGTGCATCATCAGCGGCCTGTCTCATCGCCTCTGCCGG 



<210> SEQ ID NO 1001 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1001 
>R11723_PEA_1_node_20
CTGAAGCTGAA 



<210> SEQ ID NO 1002 

<211> Length : 63 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1002 
>R11723_PEA_1_node_21

GGAGATGCCACCCCCTCCTGCATTGTTCTTCCAGCCCTCGCCCCCAACCCCCCACCTCCCTGA 

<210> SEQ ID NO 1003 

<211> Length : 30 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1003
>R11723_PEA_1_node_23
TCTGAAAAGTGCTTAAGAAAATCTTCTCAG 

<210> SEQ ID NO 1004 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1004 
>R11723_PEA_1_node_24

TTCTCCTTGCAGAGGACTGGCGCCGGGACGCGAAGAGCAACGGGCGCTGCACAAAGCGGGCGCTGTCGGTGGTGGAG 
TGCGCAT 

<210> SEQ ID NO 1005 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1005 
>R11723_PEA_1_node_25
GTACGCGCAGGCGCTTCTCGTGGTTG 

<210> SEQ ID NO 1006 

<211> Length : 113 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1006 
>R11723_PEA_1_node_26
GCGTGCTGCAGCGACAGGCGGCAGCACAGCACCTGCACGAACACCCGCCGAAACTGCTGCGAGGACACCGTGTACAG
GAGCGGGTTGATGACCGAGCTGAGGTAGAAAAACGT 

<210> SEQ ID NO 1007 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1007 
>R11723_PEA_1_node_27

CTCCGAGAAGGGGAGGAGGATCATGTACGCCCGGAAGTAGGACCTCGTCCAGTCGTGCTTGGGTTTGGCCGCAGCCA 
TGATC 

<210> SEQ ID NO 1008 

<211> Length : 24 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1008 
>R11723_PEA_1_node_28
CTCCGAATCTGGTTGGGCATCCAG 

<210> SEQ ID NO 1009 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1009 
>R11723_PEA_1_node_29
CATACGGCCAATGTCACAACAATCAGCC 
<210> SEQ ID NO 1010

<211> Length : 10 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1010 
>R11723_PEA_1_node_3
CTCCGCGGCA 

<210> SEQ ID NO 1011 

<211> Length : 21 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1011 
>R11723_PEA_1_node_30
CTGGGCAGACACGAGCAGGAG 

<210> SEQ ID NO 1012 

<211> Length : 52 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1012 
>R11723_PEA_1_node_4
GCAGCGCGGGCCCCAGCAGCCTCGGCAGCCACAGCCGCTGCAGCCGGGGCAG

<210> SEQ ID NO 1013 

<211> Length : 43 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1013 
>R11723_PEA_1_node_5

CCTCCGCTGCTGTCGCCTCCTCTGATGCGCTTGCCCTCTCCCG 

<210> SEQ ID NO 1014 

<211> Length : 32 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1014 

>R11723_PEA_1_node_6

GCCCCGGGACTCCGGGAGAATGTGGGTCCTAG 

<210> SEQ ID NO 1015 

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1015 
>R11723_PEA_1_node_7

GCATCGCGGCAACTTTTTGCGGATTGTTCTTGCTTCCAG 



<210> SEQ ID NO 1016 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1016 
>R11723_PEA_1_node_8

GTGAGAATACCCAGAGGCCAGCAGCCGAGGCCAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1017 

<211> Length : 438 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1017 
>R16276_PEA_1_node_0

GTGGAGGAGGATGGTGGGGAGTGGTGGTGTCTTCGTCCTGGGAGAAGGCGAAGCAACTTCCAGGAGGAAACGGGCGT 
TTCCTTCCCACGCGCTCGAGCGAGCCCTGGGTCCTGGCCTCGGAACTCCACCCAGCCCCTCCCCACCCTCTGGGAAA 
AGCCAGTCGCCACACACAGGCACACGCAGGCCCCGGCGCCGCGCCCTAAGGAGAGCAGCACCCACAGCCAATTGCCA 
TGGCAACCCCGGGGTTCGTTCCACTTCCCCACCCAGCCGATCTCCCCCCTCCTCCCTGCACTGCAGCCAACCGGCTT 
GTGCGCGTCCCAGGAGCGCGCTATAAAACCTGTGCTGGGCGTGATCGGCAAGCACCGGACCAGGGGGAAGGCGAGCA 
GTGCCAATCTACAGCGAAGAAAGTCTCGTTTGGTAAAAGCGAGAAGGGAAAGC 
<210> SEQ ID NO 1018

<211> Length : 122 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1018 
>R16276_PEA_1_node_6

GTAATCCTGCTCCCTCTGCTGTTTGACCTCTTCTCCTGCAGCTAAGTGAAGCTGCTTCCTCCCTTCTCTTTTGTATT 
CCCCTTCCCAGAGGGCGATAAGCAAATAATAATAATGCAATAAAT 

<210> SEQ ID NO 1019 

<211> Length : 90 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1019 
>R16276_PEA_1_node_1

CTGAGCATGCAGAGTGTGCAGAGCACGAGCTTTTGTCTCCGAAAGCAGTGCCTTTGCCTGACCTTCCTGCTTCTCCA 
TCTCCTGGGACAG 

<210> SEQ ID NO 1020 

<211> Length : 111 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1020 
>R16276_PEA_1_node_4

GTCGCTGCGACTCAGCGCTGCCCTCCCCAGTGCCCGGGCCAGTGCCCTGCGACGCCGCCGACCTGCGCCCCCGGGGT 
GCGCGCGGTGCTGGACGGCTGCTCATGCTGTCTG 



<210> SEQ ID NO 1021 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1021 
>R16276_PEA_1_node_5

GTGTGTGCCCGCCAGCGTGGCGAGAGCTGCTCAGATCTGGAGCCATGCGACGAGAGCAGTGGCCTCTACTGTGATCG 
CAGCGCGGACCCCAGCAACCAGACTGGCATCTGCACGG 



Segment nucleic acid sequences : 

<210> SEQ ID NO 1022 

<211> Length : 232 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1022 
>H61775_node_2 

ATCTGGTGGTTCTCCGGAGAGCAGCTTCCTTGGGTGTTACATGAGCCAAGCCCTCACTGTACAGAAGAGTGAGAGCT 
GAAACCTGTTCCCTGAGCTGATCAGAAGGACATCCCTTGGCCCCTCCATCTGGGCTCCTGTGGATAGGAGGGGCTGG 
GTGAGCAGGCCAGCTGGGCTATGGTGTGGTGCCTCGGCCTGGCCGTCCTCAGCCTGGTCATCAGCCAGGGGGCTGAC
G 

<210> SEQ ID NO 1023 

<211> Length : 189 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1023 
>H61775_node_4

GTCGAGGGAAGCCTGAGGTGGTATCGGTGGTGGGCCGGGCTGGGGAGAGTGTGGTGCTGGGCTGTGACCTGCTGCCC 
CCGGCCGGCCGGCCCCCCCTGCATGTCATCGAGTGGCTGCGCTTTGGATTCCTGCTTCCCATCTTCATCCAGTTCGG 
CCTCTACTCTCCCCGAATTGACCCTGATTACGTGG 

<210> SEQ ID NO 1024 

<211> Length : 201 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1024 
>H61775_node_6

GTTCTCCTCAGGTGGGAGGTAGGGAGGTATCAGCAAGAAAGGTGGGCTGGGTAGAGTCGCACAAGGCCTCCTATGAA 
CGGCTTTGTCCCTGCTCTGATCTCATCTCCAGCTCTGCTGCCTTAACTCTGCTTAATAAGCATGGCTGTGCTCCCAA 
GCAGTGTTAATTCATTGAAAGATGTCATTCATTTACACACACACACA 

<210> SEQ ID NO 1025 
<211> Length : 698 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1025 
>H61775_node_8 

GTGACTGTGGGTTCCCTGCCTTCCGAGAGCTTAAGAGAGCAGAGACTGTGTCTCCTGTTTTCTTCACACGCCGCTGC 
ATATGGGAAGATCTGAAGTCAACAGGCTTTAGCCCTGCAGGTGGAGGGAGGCCTCCAGGAGGTGGGCCCAGGACTCA 
GGAGGACTCAGGGCTGCCCTGCTGGCGATCTTCCTGTTCTGTAACACTACAGGTCTAGCAGTCCAGCTGTCACAGAA 
AAGCTAGGACATGCAGTATGCTTCTTTGGATATTCTGAGTAACATTTGGACTGTTACCCATTGGCTACCAGCATCTC 
CCAAGTGAGAATACATAGATTACCCCCAGTGCCCTGAACAGCACTCGGTCCTAACACCCGTGTCCATGGAAAGCACG 
CCGCGTCTGGAGAAAGAAGCCGAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATT 
CCTTCAGCTACAATGGAATTCTTGGGAGGATTAAATATGGTGACAACGCCTAATATTAGATGGCCTGTATTCCACAC 
TCAATCTTCCTTCCCTCTTCTTCCTTCTTTGTAGAGCTATAATGAAAAGTATCATGTGGGACACAGAAGAGGTTGCA 
GTCTGGGGTCTGCAGGGCTTAGCGGCCAGGCAGATTAGCTTTCTTGAGGAATCCTGACAGTGGGTGGAAGGGTATGA 

TGATG 

<210> SEQ ID NO 1026 

<211> Length : 86 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1026 
>H61775_node_0

GGAGGCGCTCGGGGCATCCGAGGCGGGGAGGCGGGTCCGCCCCCTATTGTGTAGCGGCGAGAGTGGAGCCGAGCGGT 
GCGGAGCAG 

<210> SEQ ID NO 1027 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1027

>H61775_node_5 

GATAAGA 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1028 

<211> Length : 203 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1028 
>M85491_PEA_1_node_0

TCTGCTGGCTGCGCGGTGGCGGCGGCTGTGTGTGCGCCGCGCCTTGCCGCCCCCCCTGGCCCCCCGAGCCCGGGGCG 
CGCGCTCCCGCCCGGGCCGTCCGGGCCCCGCGGCGCCGCGGCCCGAGGCCCCGGGAAGCGCAGCCATGGCTCTGCGG 
AGGCTGGGGGCCGCGCTGCTGCTGCTGCCGCTGCTCGCCGCCGTGGAAG 



<210> SEQ ID NO 1029 

<211> Length : 229 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1029 
>M85491_PEA_1_node_13

AGAGACAGGATCTCACTATGTTGTCCAGGCTGGTCTTGAACTCGTGGCCACAAATGATCCTCCCACCTCAGCCTCCC 
AAAGTGTTGGAATTATAGGCATGAACCACCATGCCCAGGAGGAGAATTTTTGATAATAATATTTTGTGGACATCTTT 
GCATATCATGTCAGAGCTATAACATCATTGTGGAGAAGCTCTTAGGATCCCATAGAATAAATGTACCGTAATTTA 
<210> SEQ ID NO 1030

<211> Length : 336

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1030 
>M85491_PEA_1_node_21

CCATCCCCTCCGCGCCCCAGGCTGTGATTTCCAGTGTCAATGAGACCTCCCTCATGCTGGAGTGGACCCCTCCCCGC 
GACTCCGGAGGCCGAGAGGACCTCGTCTACAACATCATCTGCAAGAGCTGTGGCTCGGGCCGGGGTGCCTGCACCCG 
CTGCGGGGACAATGTACAGTACGCACCACGCCAGCTAGGCCTGACCGAGCCACGCATTTACATCAGTGACCTGCTGG 
CCCACACCCAGTACACCTTCGAGATCCAGGCTGTGAACGGCGTTACTGACCAGAGCCCCTTCTCGCCTCAGTTCGCC 

TCTGTGAACATCACCACCAACCAGGCAG 



<210> SEQ ID NO 1031 

<211> Length : 125 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1031 
>M85491_PEA_1_node_23

CTCCATCGGCAGTGTCCATCATGCATCAGGTGAGCCGCACCGTGGACAGCATTACCCTGTCGTGGTCCCAGCCAGAC 
CAGCCCAATGGCGTGATCCTGGACTATGAGCTGCAGTACTATGAGAAG 



<210> SEQ ID NO 1032 
<211> Length : 1,305 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1032 
>M85491_PEA_1_node_24

GTACCTATTGGCTGGGTGCTGTCCCCATCACCCACCTCCCTGAGGGCCCCTCTCCCAGGCTGAGGCCTGGGAGTTCT 
GCCCCACCGCAAGATGAGACGCACTGGTGCAGCAGAAAGAGCACTGGCCTTGGAGTCAGGCTGCCTGGCTCCCAATC 
CAGCTCCGCTCCTTCCCACTGTGAGACCTCAGGCAGGTGCCTTGACCTCTCTGGATCTCACTTTTCTGGTCTGGAGG 
ATACACCCAGCAATCTCAGTGAAATGCAACAGTCACATCCCTTTCCCTACCACGACCCTTTCATCTTGACCTCAGTG 
GCTTGATGTTGGGAAAAACTGGGTTTCCAAAAAGCTGCACTTATGAAGTGATAATTAGTCACTCACCTCTTCTTCGA 
CAGAGATTTGAAACAGCTCAAGAGAGCTTCCGCCTGCCCTGCTCTGAGTCCTGCTAAAACACCCACTTTCACTCGCC 
TGCATGCCCTTTGCATGGGGAGAGGTGATTTCACTTTGAGCTTTTAAATCAGACCTTAATTACTCCCTTTGGGTGGA 
AGCCCCTGGGATGGTAGAAGGATCACTGGACTAAGAGTGAGAAGCCGTAGGTTCAAATCCCAGCTCCGTCCTTCACC 
AGCTATGTGACCTTGGGCAGGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCCC 

TCGCGGGGAGCTATGCAGGTTACGGAGAAAAGGCAGCACAGCACCCAGAATGGGACCTGGCCCTCAGCAGAGGCCAT

GTGTGTCCCTGGCCTTCCTCCTCTGCCCTGCCTGCTGCACAGTGGGCAATGGTGACAGGATGGGAGGCCAAGTGGAT 
GTGGGGTCTGCACAGTACAGGGGCCAGGAGGTAGACAGCACAATTGCCCACCCACATGGCTGGACATCAGAGGCCCC 
AGGAAGCCTCTCCTTTGAATGATCACTTCTCTTACCTGCTCCAGGAGGCAACAAACAGCCACAGAGGCTGCAAGGGC 
ACCTGGGAAAGGCATCGCGGGGCTTCCATTCAGACTAGGTGTCAATGACTGACAGGGAGGCCTTTGGTTGAGGGCAA 
GCCCACGGGGAACTGCAGATGGATGGAAGGGCTCTCCCTGAAGGCTGAGAGGAAGAGTGCAGTCAATTGCAGCCAGT 
CCTGCTGGAGCCCAACTTTCTAGAGCCCAGCCCGGCCTTCCCACTCTGTTAACTGCTGGATCGGCTAACCAGGCCGG 
TCTCCAGGGCCTTTCAAACACTTACCCAGCCTTTGCCGGCCGTCTTACCATTGCTTGCGTGCGTGTTCATCCC 



<210> SEQ ID NO 1033 

<211> Length : 404 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1033 
>M85491_PEA_1_node_8

TGGGAAGAGGTGAGTGGCTACGATGAGAACATGAACACGATCCGCACGTACCAGGTGTGCAACGTGTTTGAGTCAAG 
CCAGAACAACTGGCTACGGACCAAGTTTATCCGGCGCCGTGGCGCCCACCGCATCCACGTGGAGATGAAGTTTTCGG 
TGCGTGACTGCAGCAGCATCCCCAGCGTGCCTGGCTCCTGCAAGGAGACCTTCAACCTCTATTACTATGAGGCTGAC 
TTTGACTCGGCCACCAAGACCTTCCCCAACTGGATGGAGAATCCATGGGTGAAGGTGGATACCATTGCAGCCGACG^
GAGCTTCTCCCAGGTGGACCTGGGTGGCCGCGTCATGAAAATCAACACCGAGGTGCGGAGCTTCGGACCTGTGTCCC 

GCAGCGGCTTCTACCTGGC 

<210> SEQ ID NO 1034 

<211> Length : 184 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1034 
>M85491_PEA_1_node_9

CTTCCAGGACTATGGCGGCTGCATGTCCCTCATCGCCGTGCGTGTCTTCTACCGCAAGTGCCCCCGCATCATCCAGA 
ATGGCGCCATCTTCCAGGAAACCCTGTCGGGGGCTGAGAGCACATCGCTGGTGGCTGCCCGGGGCAGCTGCATCGCC 
AATGCGGAAGAGGTGGATGTACCCATCAAG 

<210> SEQ ID NO 1035 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1035 
>M85491_PEA_1_node_10

CTCTACTGTAACGGGGACGGCGAGTGGCTGGTGCCCATCGGGCGCTGCATGTGCAAAGCAGGCTTCGAGGCCGTTGA 
GAATGGCACCGTCTGCCGAG 



<210> SEQ ID NO 1036 
<211> Length : 91 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1036 
>M85491_PEA_1_node_18

GTTGTCCATCTGGGACTTTCAAGGCCAACCAAGGGGATGAGGCCTGTACCCACTGTCCCATCAACAGCCGGACCACT 
TCTGAAGGGGCCAC 



<210> SEQ ID NO 1037 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1037 
>M85491_PEA_1_node_19

CAACTGTGTCTGCCGCAATGGCTACTACAGAGCAGACCTGGACCCCCTGGACATGCCCTGCACAA 



<210> SEQ ID NO 1038 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1038 
>M85491_PEA_1_node_6

AAACGCTAATGGACTCCACTACAGCGACTGCTGAGCTGGGCTGGATGGTGCATCCTCCATCAGGG 
Segment nucleic acid sequences: 



<210> SEQ ID NO 1039 
<211> Length : 810

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1039 
>T39971_node_0

GAGACTGAGCCTGGGGACAGGGAGTGGCCTGCTCAGAAAAGACTCAGAAATTAAATCCAGTCCAGTGGGTTGATATT 
TACCCAAATTTCCAGCCTGGGGAGATTGATGCACCCAAGAGAAGAACCCAGAAATGAAACTTTGTTCTTTTATGCTA 
AAAAATAAAATTCCCCAGAGTGCTTACAATCTCTCCTCCCACTCCCTTTTTCCTGCCCTAAATAAATAATGGCGAAT 
GAGCACCCAGCCAGGGATGTGTCTGATCAAACAATCATGGATCAATAGCTATGTTTGGAGAAGGAATTTGTGGCTGC 
TCCAGCTACTGGGCATTTTGTCTGGTCCAGTTCATGTAATCTCCCAACACCCCATGAAGCAAGGCTTTGTTAATCCT 
ATTTTACTGAAAATGAACTAAGACTCAGAGAGATAAAGCTGTTGCCCAATGAGCCTTCTTTCTGCCCTCCAGATCCA 
CGGTGCTAATTCCCCTTCCGATGACCTAATGATTCTGAGCTTGGCAAAGGTCTTATCTCCCAGCTCGCCCAGGCCCA 
GTGTTCCAGGAATGTGACCTTTGCTGCAGCAGCCGCTGGAGGGGGCAGAGGGGATGGGCTGGAGGTTGAGCAAACAG 
AGCAGCAGAAAAGGCAGTTCCTCTTCTCCAGTGCCCTCCTTCCCTGTCTCTGCCTCTCCCTCCCTTCCTCAGGCATC 
AGAGCGGAGACTTCAGGGAGACCAGAGCCCAGCTTGCCAGGCACTGAGCTAGAAGCCCTGCCATGGCACCCCTGAGA 
CCCCTTCTCATACTGGCCCTGCTGGCATGGGTTGCTCTGG 



<210> SEQ ID NO 1040 

<211> Length : 168 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1040 
>T39971_node_18 

GTGCCAGGGGCTGTGGGCCAGGGTAGAAAGCATCTAGGGAGGGTTTGAGAGCTATTGCTCCCAGGGACAGGGTGGAC 
AGGGAAGCTGGACCCAGGGCCCTGCAGGACCTGGTGGGAGCTCTGTGAGCACAGGGCAGCCCCAAGACTCCAGGTCC 

TGGGCAGTGAACCT 



<210> SEQ ID NO 1041 
<211> Length : 157

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1041 
>T39971_node_21

GGTAGTCAGTACTGGCGCTTTGAGGATGGTGTCCTGGACCCTGATTACCCCCGAAATATCTCTGACGGCTTCGATGG 
CATCCCGGACAACGTGGATGCAGCCTTGGCCCTCCCTGCCCATAGCTACAGTGGCCGGGAGCGGGTCTACTTCTTCA 

AGG 



<210> SEQ ID NO 1042 

<211> Length : 198 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1042 
>T39971_node_22

GTACTCAGGGGGTGGTGGGAGACTGAGCAGGCAGTGGAGCAGTCTTGGATTCCTTTCACATTTCACTGGGGACAGGC 
CTCAGCATGTGCCCACCCCTGACCCCCACCTCATGCTGGGAGATCCTAACTTCAACAGCCTCTGGGATCTCCAGTCT 
TGCCCTGGCCCAGCCCTCCTAATGCCCACCACCCCGCTCCTCAG 



<210> SEQ ID NO 1043 

<211> Length : 153 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1043 
>T39971_node_23
GGAAACAGTACTGGGAGTACCAGTTCCAGCACCAGCCCAGTCAGGAGGAGTGTGAAGGCAGCTCCCTGTCGGCTGTG
TTTGAACACTTTGCCATGATGCAGCGGGACAGCTGGGAGGACATCTTCGAGCTTCTCTTCTGGGGCAGAACCTCTG 

<210> SEQ ID NO 1044 

<211> Length : 140 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1044 
>T39971_node_31 

GCCATCCCGCGCCACGTGGCTGTCCTTGTTCTCCAGTGAGGAGAGCAACTTGGGAGCCAACAACTATGATGACTACA 
GGATGGACTGGCTTGXGCCTGCCACCTGTGAACCCATCCAGAGTGTCTTCTTCTTCTCTGGAG 



<210> SEQ ID NO 1045 

<211> Length : 127 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1045 
>T39971_node_33 

ACAAGTACTACCGAGTCAATCTTCGCACACGGCGAGTGGACACTGTGGACCCTCCCTACCCACGCTCCATCGCTCAG 
TACTGGCTGGGCTGCCCAGCTCCTGGCCATCTGTAGGAGTCAGAGCCCAC 



<210> SEQ ID NO 1046 

<211> Length : 223 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1046
>T39971_node_7

TGACTCGCGGGGATGTGTTCACTATGCCGGAGGATGAGTACACGGTCTATGACGATGGCGAGGAGAAAAACAATGCC 
ACTGTCCATGAACAGGTGGGGGGCCCCTCCCTGACCTCTGACCTCCAGGCCCAGTCCAAAGGGAATCCTGAGCAGAC 
ACCTGTTCTGAAACCTGAGGAAGAGGCCCCTGCGCCTGAGGTGGGCGCCTCTAAGCCTGAGGGGATAGA 



<210> SEQ ID NO 1047 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1047 
>T39971_node_1
CTGACCAAG 



<210> SEQ ID NO 1048 

<211> Length : 44 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1048 
>T39971_node_10

GGAGACCTCAGCCCCCAGCAGAGGAGGAGCTGTGCAGTGGGAAG 



<210> SEQ ID NO 1049 
<211> Length : 38 
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 1049 
>T39971_node_11

CCCTTCGACGCCTTCACCGACCTCAAGAACGGTTCCCT 



<210> SEQ ID NO 1050 

<211> Length : 14 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1050 
>T39971_node_12
CTTTGCCTTCCGAG 



<210> SEQ ID NO 1051 

<211> Length : 32 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1051 
>T39971_node_15 

GGCAGTACTGCTATGAACTGGACGAAAAGGCA 



<210> SEQ ID NO 1052 
<211> Length : 24

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1052 

>T39971_node_16

GTGAGGCCTGGGTACCCCAAGCTC 

<210> SEQ ID NO 1053 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1053 
>T39971_node_17 

ATCCGAGATGTCTGGGGCATCGAGGGCCCCATCGATGCCGCCTTCACCCGCATCAACTGTCAGGGGAAGACCTACCT 
CTTCAAG 

<210> SEQ ID NO 1054 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1054 
>T39971_node_26

CTGGTACCAGACAGCCCCAGTTCATTAGCCGGGACTGGCACG 
<210> SEQ ID NO 1055

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1055 
>T39971_node_27

GTGTGCCAGGGCAAGTGGACGCAGCCATGGCTGGCCGCATCTACATCTCAG 

<210> SEQ ID NO 1056 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1056 
>T39971_node_28
GCATGGCAC 

<210> SEQ ID NO 1057 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1057 
>T39971_node_29

CCCGCCCCTCCTTGGCCAAGAAACAAAGGTTTAGGCATCGCAACCGCAAAGGCTACCGTTCACAACGAGGCCACAGC 
CGTGGCCGCAACCAGAAC 
<210> SEQ ID NO 1058

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1058 
>T39971_node_3

AGTCATGCAAGGGCCGCTGCACTGAGGGCTTCAACGTGGACA 

<210> SEQ ID NO 1059 

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1059 

>T39971_node_30 

TCCCGCCG 

<210> SEQ ID NO 1060 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1060 

>T39971_node_34 

ATGGCCG 
<210> SEQ ID NO 1061

<211> Length : 17 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1061 
>T39971_node_35
GGCCCTCTGTAGCTCCC 

<210> SEQ ID NO 1062 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1062 
>T39971_node_36

TCCTCCCATCTCCTTCCCCCAGCCCAATAAAGGTCCCTTAGCCCCGAAAAAAAAGCKATAAT 

<210> SEQ ID NO 1063 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1063 
>T39971_node_4
AGAAGTGCCAGTGTGACGAG

<210> SEQ ID NO 1064

<211> Length : 58 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1064 
>T39971_node_5 

CTCTGCTCTTACTACCAGAGCTGCTGCACAGACTATACGGCTGAGTGCAAGCCCCAAG 

<210> SEQ ID NO 1065 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1065 

>T39971_node_8 

CTCAAG 

<210> SEQ ID NO 1066 

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1066 
>T39971_node_9
GCCTGAGACCCTTCATCCAG



Segment nucleic acid sequences: 

<210> SEQ ID NO 1067 

<211> Length : 327 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1067 
>Z21368_PEA_1_node_0

AGGTXACTTGACTGGGAGTTCTCAGACCTCCAGTTTCAGCCCTGCCCTCAGCCTCCAATCCGTAAGAGACACCCAGC 
CCCAGCAATTGGATTGGGCAGCCCGTCTTGACACACCACTGTGCTGAGTGCTTGAGGACGTGTTTCAACAGATGGTT 
GGGGTTAGTGTGTGTCATCACATTCGAGTGGGGATTAAGAGAAGGAAGGCTGCCTTGCTGGAGCTGTGTGGTCTTCT 
CCAAGTGAGAGTCGCAGGCAATAGAACTACTTTGCTTTTGGAGGAAAAGGAGGAATTCATTTTCAGCAGACACAAGA 

AAAGCAGTTTTTTTTTCAG 

<210> SEQ ID NO 1068 

<211> Length : 177 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1068 
>Z21368_PEA_1_node_15

AACTCCAGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATGAAGTATTCTTGCTG 
TGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGTTCAGAG 

GACGGATACAGCAGGAACGAAAA 
<210> SEQ ID NO 1069

<211> Length : 240 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1069 
>Z21368_PEA_1_node_19

GGTCCCTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATCAATGCCTTTGTGACT 
ACACCCATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAATGTCTACACCAACAA 
CGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATCTTAACAACACTGGCT 

ACAGAACAG 



<210> SEQ ID NO 1070 

<211> Length : 300 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1070 
>Z21368_PEA_1_node_2

TCTTCATCTTGCGAGCACTTGGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGG 
CAACTAGGAAACCCCAGGCGCAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGC 
GCTCCTCGGAGGTCAGGGCAGATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTA 
TCTCTGCTCCTCTGCCTCCTCCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTT 



<210> SEQ ID NO 1071 
<211> Length : 152 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1071 

>Z21368_PEA_1_node_21

CCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGGTGGCGAGAATGGCTTGGATTAATC 
AAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAAGCATGGATTTGATTATGCAAAG 

<210> SEQ ID NO 1072 

<211> Length : 176 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1072 

>Z21368_PEA_1_node_33

CTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCATTTACACCGCCGACCATGGTTACCATAT 
TGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTC 
CAAGTGTAGAACCAGGATCAAT 

<210> SEQ ID NO 1073 

<211> Length : 129 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1073 
>Z21368_PEA_1_node_36

AGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCTCCTGATG 
TGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAG 
<210> SEQ ID NO 1074

<211> Length : 279 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1074 
>Z21368_PEA_1_node_37

GTGTGTCATTGTTCCTCCTCTCAGCCAGCCCCAAATACACTGAGCTCCAGCTGGTGCCCAGAGCCAGCCAGCAGCTG 
AAGACATGGAGGCAGAATATGCCTTGCCCACAAGGATCACCCCAAGCTGAGCATTTCTCAGCTGCTTGTGAATAGCA 
TATTGATGGAGATGCACTCATGGTCTGTGGGAAGTGAGAGGTGTTTCTTTAAATAAGCTGTTAGCACAGATCCATTT 
GGAAAAACGTCCAGATGCCAAAAGTAAATATTATCATTTTGCTTTCAG 



<210> SEQ ID NO 1075 

<211> Length : 853 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1075 
>Z21368_PEA_1_node_39

GTAATTATTGGTTCCTGGGGTGCTTCTGGGAACCAGTCCTAGTGGGCAGCTTTCCCTGCTGAGTATTTTTTTTCTCC 
TTATTTTTGTTTACTAAGCATGCAGATTTCGTAAACCTAGTCACAAGATTGAATGGTTTGCTGCTTATTCTGTAGTG 
GTCAATAGAGTAATAATTGCTGGATCAGAATTGTAAAGAATAACCCTCAAGTTGGTTAATTGGTACAAAAACACAGT 
TAGATAGAAGTTATAGAATTTGATAGTATAGTTGGGACATTATCGTTAACAATAATTTATGTATATCTTAAAATAGC 
TAGAAGTGAAGAATTGCAAAGTTCCCAACACAAGGAAAAGATAAATGAGATGATGAATATCCCAATTATCTTGATTT 
GATCATTACACATTGTAGACTGGTATCCATATATCACACGTACCCCCAAAATATGTATAATTGTGATATATCAATTT 
TTAAAATACCAAAAAAGCAAGAGAATGACGACTCCACATCCCCCAAAAAGAATAAATTCTCATAAGCTTGGACCAAA 
GCCTTTATCATGGGTGTAGATTTACTGTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTC 
TACCATCCTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCTTCTTCCTACTTCCTACTTCTTT 
TCCTTGTGTGCCTTTCCTAGTTTTAAATAGATAAATGTATGCCATTGTAATTATTTCCATTGTCACTTCTGGGTTTC 
CCCTTTTGGTTCATTAATACCCATTGCCTTGTTTTTCTCTGTACATAAATTAGGAGAGAGAAAATATTTGTATAATT
TTTTTA 

<210> SEQ ID NO 1076 

<211> Length : 162 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1076 
>Z21368_PEA_1_node_4

GGATTCTTCACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTAT 
TTAGTCTACAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGA 

ATGGAGAG 

<210> SEQ ID NO 1077 

<211> Length : 130 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1077 
>Z21368_PEA_1_node_41

CAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCA 
AAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAG 

<210> SEQ ID NO 1078 

<211> Length : 217 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1078
>Z21368_PEA_1_node_43

AAGTGGCAATGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAG^ 
CCGGCAGAGCACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTT 
ACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCAA 



<210> SEQ ID NO 1079 

<211> Length : 256 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1079 
>Z21368_PEA_1_node_45

AGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACATA 
AATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGGG 
GCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCAC 

CTACCACTGTCCGAGTGACACACAA 

<210> SEQ ID NO 1080

<211> Length : 176 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1080 
>Z21368_PEA_1_node_53

GGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGAGGAAGG 
AGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAACCACTGG 

CAGACAGCCCCGTTCTGGAACC 
<210> SEQ ID NO 1081 
<211> Length : 143

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1081 
>Z21368_PEA_1_node_56

TGGGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAAT 
TTTCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAG 

<210> SEQ ID NO 1082 

<211> Length : 124 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1082 
>Z21368_PEA_1_node_58

CTCACAAATACAGTGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTG 
TCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATCTTGATGTTG 

<210> SEQ ID NO 1083 

<211> Length : 588 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1083 
>Z21368_PEA_1_node_66

AGGACAGTTATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAG 
CTACACAGTGTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTT 
GAAGGATTTAGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTC 
AAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGAC 
CCAAGGCATAAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAAC 
ATTAAGTCCAGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAA
AAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTA 
GAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAG 

<210> SEQ ID NO 1084 

<211> Length : 585 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1084 
>Z21368_PEA_1_node_67

AACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATT 
TCAGTTCATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAA 
GGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGA 
TGAAACTGTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGC 
CACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACG 
GAGACTCATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGA 
TTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGACATAAGTATATACATGT 
TATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAA 

<210> SEQ ID NO 1085 

<211> Length : 1,188 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1085 
>Z21368_PEA_1_node_69

TTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAA 
ATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGAC 
AAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATG 
TTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAA 
TAAATAAGTAAACAAAATGAAGATTGCCTGCTCTCTCTGTGCCTAGCCTCAAAGCGTTCATCATACATCATACCTTT
AAGATTGCTATATTTTGGGTTATTTTCTTGACAGGAGAAAAAGATCTAAAGATCTTTTATTTTCATCTTTTTTGGTT 
TTCTTGGCATGACTAAGAAGCTTAAATGTTGATAAAATATGACTAGTTTTGAATTTACACCAAGAACTTCTCAATAA 
AAGAAAATCATGAATGCTCCACAATTTCAACATACCACAAGAGAAGTTAATTTCTTAACATTGTGTTCTATGATTAT 
TTGTAAGACCTTCACCAAGTTCTGATATCTTTTAAAGACATAGTTCAAAATTGCTTTTGAAAATCTGTATTCTTGAA 
AATATCCTTGTTGTGTATTAGGTTTTTAAATACCAGCTAAAGGATTACCTCACTGAGTCATCAGTACCCTCCTATTC 
AGCTCCCCAAGATGATGTGTTTTTGCTTACCCTAAGAGAGGTTTTCTTCTTATTTTTAGATAATTCAAGTGCTTAGA 
TAAATTATGTTTTCTTTAAGTGTTTATGGTAAACTCTTTTAAAGAAAATTTAATATGTTATAGCTGAATCTTTTTGG 
TAACTTTAAATCTTTATCATAGACTCTGTACATATGTTCAAATTAGCTGCTTGCCTGATGTGTGTATCATCGGTGGG 
ATGACAGAACAAACATATTTATGATCATGAATAATGTGCTTTGTAAAAAGATTTCAAGTTATTAGGAAGCATACTCT 
GTTTTTTAATCATGTATAATATTCCATGATACTTTTATAGAACAATTCTGGCTTCAGGAAAGTCTAGAAGCAATATT 

TCTTCAAATAAAAGGTGTTTAAACTTTTTTCTG 



<210> SEQ ID NO 1086 

<211> Length : 45 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1086 
>Z21368_PEA_1_node_11

GACCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTG 



<210> SEQ ID NO 1087 

<211> Length : 28 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1087 
>Z21368_PEA_1_node_12
ATTATTCAACCAGGATACCTAATTCAAG 
<210> SEQ ID NO 1088

<211> Length : 15 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1088 
>Z21368_PEA_1_node_16
AACATCCGACCCAAC 



<210> SEQ ID NO 1089 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1089 
>Z21368_PEA_1_node_17

ATTATTCTTGTGCTTACCGATGATCAAGATGTGGAGCTGG 



<210> SEQ ID NO 1090 

<211> Length : 74 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 1090 
>Z21368_PEA_1_node_23

GACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTAAGAGAATGTATCCCCATAG 

<210> SEQ ID NO 1091 

<211> Length : 96 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1091 
>Z21368_PEA_1_node_24

GCCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTCTAAACTGTACC 
CCAATGCTTCCCAACACAT 

<210> SEQ ID NO 1092 

<211> Length : 59 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1092 
>Z21368_PEA_1_node_30

AACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTATGCAGTACACAG 

<210> SEQ ID NO 1093 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 1093 
>Z21368_PEA_1_node_31

GACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTGATGTCAGTGGAT 
GATTCTGTGGAGAGG 

<210> SEQ ID NO 1094 

<211> Length : 57 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1094 
>Z21368_PEA_1_node_38

GTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGG 

<210> SEQ ID NO 1095 

<211> Length : 97 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1095 
>Z21368_PEA_1_node_47

GTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATA 
AGGCATACATTGACAAAGAG 

<210> SEQ ID NO 1096 
<211> Length : 95 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1096 
>Z21368_PEA_1_node_49

ATTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATG 
TAGCTGCAGTAAACAAAG 

<210> SEQ ID NO 1097 

<211> Length : 66 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1097 
>Z21368_PEA_1_node_51

CTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCCATTCAA 

<210> SEQ ID NO 1098 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1098 
>Z21368_PEA_1_node_61

GAAATAAAGATGGAGGAAGCTATGACCTACACAG 



<210> SEQ ID NO 1099 
<211> Length : 53

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1099 
>Z21368_PEA_1_node_68

CTTGACACCCCTGGTAAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAG 

<210> SEQ ID NO 1100 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1100 
>Z21368_PEA_1_node_7

GTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAAT 
TGCAGAAAATCTTCAAAG 

Segment nucleic acid sequences: 

<210> SEQ ID NO 1101 

<211> Length : 148 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1101 
>HSKITCR_node_0 

GGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCTCGGATCCCATCGCAGCTAC 
CGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCTACTGCTTCGCGTCCAGACAG 
<210> SEQ ID NO 1102

<211> Length : 190 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1102 
>HSKITCR_node_11

ATAAAGGATTCATTAATATCTTCCCCATGATAAACACTACAGTATTTGTAAACGATGGAGAAAATGTAGATTTGATT 
GTTGAATATGAAGCATTCCCCAAACCTGAACACCAGCAGTGGATCTATATGAACAGAACCTTCACTGATAAATGGGA 

AGATTATCCCAAGTCTGAGAATGAAAGTAATATCAG 

<210> SEQ ID NO 1103 

<211> Length : 194 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1103 
>HSKITCR_node_17 

ATGCTCTGCTTCTGTACTGCCAGTGGATGTGCAGACACTAAACTCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTC 
AGAGTTCTATAGATTCTAGTGCATTCAAGCACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGACT 

TCTGCCTATTTTAACTTTGCATTTAAAGGTAACAACAAAG 

<210> SEQ ID NO 1104 

<211> Length : 270 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1104
>HSKITCR_node_2

GCTCTTCTCAACCATCTGTGAGTCCAGGGGAACCGTCTCCACCATCCATCCATCCAGGAAAATCAGACTTAATAGTC 
CGCGTGGGCGACGAGATTAGGCTGTTATGCACTGATCCGGGCTTTGTCAAATGGACTTTTGAGATCCTGGATGAAAC 
GAATGAGAATAAGCAGAATGAATGGATCACGGAAAAGGCAGAAGCCACCAACACCGGCAAATACACGTGCACCAACA 
AACACGGCTTAAGCAATTCCATTTATGTGTTTGTTAGAG 



<210> SEQ ID NO 1105 

<211> Length : 127 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1105 
>HSKITCR_node_21 

AAACCCATGTATGAAGTACAGTGGAAGGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCAACACA 
ACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTG 



<210> SEQ ID NO 1106 

<211> Length : 151 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1106 
>HSKITCR_node_27 

GGCCCACCCTGGTCATTACAGAATATTGTTGCTATGGTGATCTTTTGAATTTTTTGAGAAGAAAACGTGATTCATTT 
ATTTGTTCAAAGCAGGAAGATCATGCAGAAGCTGCACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTG 
<210> SEQ ID NO 1107

<211> Length : 125 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1107 
>HSKITCR_node_3

GTAAATGCTTGGCTTTCTGCAGTGCTGTGCTTTCAAGAATTTAATATCCTGCTCTTAATTTTGGATGACATATGGAT 
GACTGAGCCATAGATAAAATATTTCTGGCTGGGTCTAGAAGGCCTAAA 

<210> SEQ ID NO 1108 

<211> Length : 128 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1108 
>HSKITCR_node_31 

GCTCATACATAGAAAGAGATGTGACTCCCGCCATCATGGAGGATGACGAGTTGGCCCTAGACTTAGAAGACTTGCTG 
AGCTTTTCTTACCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAAT 

<210> SEQ ID NO 1109 

<211> Length : 123 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1109 
>HSKITCR_node_33 

TGTATTCACAGAGACTTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATTTTGGTCT 
AGCCAGAGACATCAAGAATGATTCTAATTATGTGGTTAAAGGAAAC 
<210> SEQ ID NO 1110

<211> Length : 1,321 

<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1110 
>HSKITCR_node_34

GTGAGTACCCATTCTCTGCTTGACAGTCCTGCAAAGGATTTTTAGTTTCAACTTTCGATAAAAATTGTTTCCTGTGA 
CTTTCATAATGTAAATCCTGTCTAGGGATATCACACATTTTAGCAGTCAAATTAAGTATACTTCAGCAAAATTTGCA 
TGGTATGCTGAACATTACTACAACTAACATTCAATAATAGAAGTCCTAATTCTAATTGTGTAATTTTGGGGCATGTG 
AAGGAAACAGAAATAGCCTTAATTTTCATTATAGCCTGAGAATAGCAATGAACTTGATTTTGCTCAAGTGTAACAAA 
TGTAGGTCATTGAAGGTCACAGCAGGAGAAATTTTGGGGGGATTGGCATGCCGTGTGAAAAATATTAAAATCTAAGA 
TCATATTCAGAGTTAGCCATATAGAATGTTGGATCCTAGAATACACGGAGAGCTATTAAATAGGTTCATAAGTAATA 
ATGGGTTTGTAGAACATAGCAAATGTTTTTCAAAGTGCTGTATTTCCTTGCCAATTTTTATTGAATCCTCAGAACAA 
ACCCTTGTCCTATAATTGCAGCACCTTCCTATTTTAAAAGTTGGAGAAACTAAGGCATGGCATATTTTAGTGCTTTT 
TATCTGAGGTGACACAGGACAATAGAGGCTGAAACTAGAGCCTGTGATTTCCAGAGTGGCATCCCATCGCCATGGAA 
AACCATATTGCCCTTTATAATATTCTCTGCCTGTCTTCCTGGGCCCATTTCTGCAGTCCAGATTTATTCCTGGAGGT 
GGATTTTTGTGCCCCTTATGATAATGTAATGTTTTCCCCCCATCACTGTGGCGTCTCTATTGGTAGCCTGTACTCGA 
TTTGGTTTTGTAAGAGTGTTTCTGTGTTTATCTCCTGGCCAAAGCATGAATTTCTGAAAAGGCTGCCCTGAAGCTAG 
ATATTATCATCAGGAGTGATAATAAAAACAACTTCTCTTCATCCTGGTGCAATTTTGGAAGGAAGAATGATAAGTCG 
TTAATTATCCAGTCCTCTGAATATTGACCAAAGGGAGGAGGAAGAAGGTGGCCGCCTCCAGGCCCAGTCTTGATGAA 
TTCTGCAATCTGTCTCCATTATTTCATGCCTGTGTAATTTTGTCAAGAGTTGTCCAATACCTCTGAATCTTTGTTTT 
TCTCCCCTTGGGAATAATAGTGATTCCTGACTCAGAAAGTTATGAGTTAATATATGTAAAGTGCTTATGACAGTGCC 
AGACATATTGTTACTATTCTTTATAGATGTCTTATTTACGATCTAAATAATCGTGACGGTGGTTACATACTGTCTTG 

TTATTGGTTAGT 



<210> SEQ ID NO 1111 

<211> Length : 194 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1111 
>HSKITCR_node_36

AATGCACTTGTGAGCCATGTATTTCAGAGGTGATTGGGATCATCTGAGTTCATATAGGTAAAAGGTTTTTGTGAGAT 
GGTACTCAAGTTATCACTCCACATTTCAGCAACAGCAGCATCTATAAGAATATCTTCTGTTCAATTTTGTTGAGCTT 
CTGAATTAACATTATTGACTCTGTTGTGCTTCTATTACAG 



<210> SEQ ID NO 1112 

<211> Length : 1,494 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1112 
>HSKITCR_node_44

GTAGACCATTCTGTGCGGATCAATTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACGATGT 
CTGAGCAGAATCAGTGTTTGGGTCACCCCTCCAGGAATGATCTCTTCTTTTGGCTTCCATGATGGTTATTTTCTTTT 
CTTTCAACTTGCATCCAACTCCAGGATAGTGGGCACCCCACTGCAATCCTGTCTTTCTGAGCACACTTTAGTGGCCG 
ATGATTTTTGTCATCAGCCACCATCCTATTGCAAAGGTTCCAACTGTATATATTCCCAATAGCAACGTAGCTTCTAC 
CATGAACAGAAAACATTCTGATTTGGAAAAAGAGAGGGAGGTATGGACTGGGGGCCAGAGTCCTTTCCAAGGCTTCT 
CCAATTCTGCCCAAAAATATGGTTGATAGTTTACCTGAATAAATGGTAGTAATCACAGTTGGCCTTCAGAACCATCC 
ATAGTAGTATGATGATACAAGATTAGAAGCTGAAAACCTAAGTCCTTTATGTGGAAAACAGAACATCATTAGAACAA 
AGGACAGAGTATGAACACCTGGGCTTAAGAAATCTAGTATTTCATGCTGGGAATGAGACATAGGCCATGAAAAAAAT 
GATCCCCAAGTGTGAACAAAAGATGCTCTTCTGTGGACCACTGCATGAGCTTTTATACTACCGACCTGGTTTTTAAA 
TAGAGTTTGCTATTAGAGCATTGAATTGGAGAGAAGGCCTCCCTAGCCAGCACTTGTATATACGCATCTATAAATTG 
TCCGTGTTCATACATTTGAGGGGAAAACACCATAAGGTTTCGTTTCTGTATACAACCCTGGCATTATGTCCACTGTG 
TATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCATTTTTTAAGGAAACAATATAACCACA 
AAGCACAGTTTGAACAAAATCTCCTCTTTTAGCTGATGAACTTATTCTGTAGATTCTGTGGAACAAGCCTATCAGCT 
TCAGAATGGCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGATTCACTGCATGGCTCCCACAGGAGT 
GGGAAAACACTGCCATCTTAGTTTGGATTCTTATGTAGCAGGAAATAAAGTATAGGTTTAGCCTCCTTCGCAGGCAT 
GTCCTGGACACCGGGCCAGTATCTATATATGTGTATGTACGTTTGTATGTGTGTAGACAAATATTTGGAGGGGTATT 
TTTGCCCTGAGTCCAAGAGGGTCCTTTAGTACCTGAAAAGTAACTTGGCTTTCATTATTAGTACTGCTCTTGTTTCT 
TTTCACATAGCTGTCTAGAGTAGCTTACCAGAAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCCTAT 
GTATTTGCAGTTCACCTGCACTTAAGGCACTCTGTTATTTAGACTCATCTTACTGTACCTGTTCCTTAGACCTTCCA 
TAATGCTACTGTCTCACTGAAACATTTAAAT 
<210> SEQ ID NO 1113

<211> Length : 402 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1113 
>HSKITCR_node_46

TTTACCCTTTAGACTGTAGCCTGGATATTATTCTTGTAGTTTACCTCTTTAAAAACAAAACAAAACAAAACAAAAAA 
CTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACATGGCAGAGTTTGTGTGTTGTCTTGAAAGATTCAGG 
TATGTTGCCTTTATGGTTTCCCCCTTCTACATTTCXTAGACTACATTTAGAGAACTGTGGCCGTTATCTGGAAGTAA 
CCATTTGCACTGGAGTTCTATGCTCTCGCACCTTTCCAAAGTTAACAGATTTTGGGGTTGTGTTGTCACCCAAGAGA 
TTGTTGTTTGCCATACTTTGTCTGAAAAATTCCTTTGTGTTTCTATTGACTTCAATGATAGTAAGAAAAGTGGTTGT 

TAGTTATAGATGTCTAG 



<210> SEQ ID NO 1114 

<211> Length : 282 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1114 
>HSKITCR_node_5 

ATCCTGCCAAGCTTTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGACAACGACACGCTGGTCCGCTGTCCTCTC 
ACAGACCCAGAAGTGACCAATTATTCCCTCAAGGGGTGCCAGGGGAAGCCTCTTCCCAAGGACTTGAGGTTTATTCC 
TGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTACCATCGGCTCTGTCTGCATTGTTCTGTGGACC 
AGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGAGGCCAG 



<210> SEQ ID NO 1115 
<211> Length : 195 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1115 
>HSKITCR_node_50 

CAATAATGTCTTTTGAATATTCCCAAGCCCATGAGTCCTTGAAAATATTTTTTATATATACAGTAACTTTATGTGTA 
AATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTATTCCTGTATGTTGTCCAATTGTTGACA 

GTTCTGAAGAATTCTAATAAAATGTACATATATAAATCAAG



<210> SEQ ID NO 1116 

<211> Length : 137 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1116 
>HSKITCR_node_7

CCTTCAAAGCTGTGCCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGGGGAAGAATTCACAGTGACG 
TGCACAATAAAAGATGTGTCTAGTTCTGTGTACTCAACGTGGAAAAGAGAAAACAGTCAG 



<210> SEQ ID NO 1117 

<211> Length : 169 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1117 
>HSKITCR_node_9
ACTAAACTACAGGAGAAATATAATAGCTGGCATCACGGTGACTTCAATTATGAACGTCAGGCAACGTTGACTATCAG
TTCAGCGAGAGTTAATGATTCTGGAGTGTTCATGTGTTATGCCAATAATACTTTTGGATCAGCAAATGTCACAACAA 
CCTTGGAAGTAGTAG 

<210> SEQ ID NO 1118 

<211> Length : 116 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1118 
>HSKITCR_node_13 

ATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACTTACACATTCCTAGTGTCCAATTCTG 
ACGTCAATGCTGCCATAGCATTTAATGTTTATGTGAATA 

<210> SEQ ID NO 1119 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1119 
>HSKITCR_node_15

CAAAACCAGAAATCCTGACTTACGACAGGCTCGTGAATGGCATGCTCCAATGTGTGGCAGCAGGATTCCCAGAGCCC 
ACAATAGATTGGTATTTTTGTCCAGGAACTGAGCAGAG 

<210> SEQ ID NO 1120 

<211> Length : 107 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1120
>HSKITCR_node_19

AGCAAATCCATCCCCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATCGTAGCTGGCATGATGTGCATTATT 
GTGATGATTCTGACCTACAAATATTTACAG 



<210> SEQ ID NO 1121 

<211> Length : 105 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1121 
>HSKITCR_node_23

GGAAAACCCTGGGTGCTGGAGCTTTCGGGAAGGTTGTTGAGGCAACXGCTTATGGCTTAATTAAGTCAGATGCGGCC 
ATGACTGTCGCTGTAAAGATGCTCAAGC 



<210> SEQ ID NO 1122 

<211> Length : 111 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1122 
>HSKITCR_node_25 

CGAGTGCCCATTTGACAGAACGGGAAGCCCTCATGTCTGAACTCAAAGTCCTGAGTTACCTTGGTAATCACATGAAT 
ATTGTGAATCTACTTGGAGCCTGCACCATTGGAG 



<210> SEQ ID NO 1123 
<211> Length : 92

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1123 
>HSKITCR_node_29

CAGCGATAGTACTAATGAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACCAAGGCCGACAAAAGGA 
GATCTGTGAGAATAG 



<210> SEQ ID NO 1124 

<211> Length : 112 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1124 
>HSKITCR_node_37 

GCTCGACTACCTGTGAAGTGGATGGCACCTGAAAGCATTTTCAACTGTGTATACACGTTTGAAAGTGACGTCTGGTC 
CTATGGGATTTTTCTTTGGGAGCTGTTCTCTTTAG 



<210> SEQ ID NO 1125 

<211> Length : 100 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1125 
>HSKITCR_node_39 

GAAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAAGTTCTACAAGATGATCAAGGAAGGCTTCCGGATGCTCAGC 
CCTGAACACGCACCTGCTGAAAT 
<210> SEQ ID NO 1126

<211> Length : 106 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1126 
>HSKITCR_node_41 

GTATGACATAATGAAGACTTGCTGGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATTGTTCAGCTAATTG 
AGAAGCAGATTTCAGAGAGCACCAATCAT 



<210> SEQ ID NO 1127 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1127 
>HSKITCR_node_43 

ATTTACTCCAACTTAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTG 



<210> SEQ ID NO 1128 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1128
>HSKITCR_node_47

GTACTTCAGGGGCACTTCATTGAGAGTTTTGTCTTG 

<210> SEQ ID NO 1129 

<211> Length : 113 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1129 
>HSKITCR_node_48

GATATTCTTGAAAGTTTATATTTTTATAATTTTTTCTTACATCAGATGTTTCTTTGCAGTGGCTTAATGTTTGAAAT 
TATTTTGTGGCTTTTTTTGTAAATATTGAAATGTAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1130 

<211> Length : 760 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1130 
>HUMGRP5E_node_0 

CCAAAATCTATGGGCTGGGACAGCAAAGATGTGGCCTACGAAGAGAAAGGTCTGGAGAATCAGAAGGCCTTCAAATG 
GTGGTTCCAAATCCCTCCAGCAAAGCCCATCCATCTTTAGAGCTCACCCGTCTCCAGCTACACCCCCCACCCCTCCC 
GGCCCAGATCAGGCAGCGGGGTCGCCCTCTCCAGGACTCTCAAGGCAGCTAAGGCTGGAGGCGCCGGCGAGCCTGGA 
GAGGGAGGAGTTCACTAAATTGTGTTGGATGGAAGGCGTCGAGGACCGGAGGAATTAATCCGATGTGGGGAAGGCGG 
ACGGGGCTACGAGGAAAAAAGAGGGGGCAATGTACACTCAGCCTTTTCATCACTCGGCGGGGAGATGGATGGTTTTC 
CGGACCGGGCGTCCCAGCGCCCCGGTTAGCTATAGGGAGACGTCAGAGCGCTCTGGTCCGCGATAGAAGAGCCCCCC 
AGCCCCCCCGCCCGGGCTTCCATATAAAGTAGGGGCCCTAGTGGAGGCCGCAGCAGTAGCACCAGCGGCTGCGGCGG 
CGGAGCTCCTCCGAGGTCCGGGTCACCAGTCTCTGCTCTTCCCAGCCTCTCCGGCGCGCTCCAAGGGCTTCCCGTCG
GGACCATGCGCGGCAGTGAGCTCCCGCTGGTCCTGCTGGCGCTGGTCCTCTGCCTGGCGCCCCGGGGGCGAGCGGTC 
CCGCTGCCTGCGGGCGGAGGGACCGTGCTGACCAAGATGTACCCGCGCGGCAACCACTGGGCGGTGG 



<210> SEQ ID NO 1131 

<211> Length : 224 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1131 
>HUMGRP5E_node_2 

GGCACTTAATGGGGAAAAAGAGCACAGGGGAGTCTTCTTCTGTTTCTGAGAGAGGGAGCCTGAAGCAGCAGCTGAGA 
GAGTACATCAGGTGGGAAGAAGCTGCAAGGAATTTGCTGGGTCTCATAGAAGCAAAGGAGAACAGAAACCACCAGCC 
ACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAGGATAGCAGCAACTTCAAAGAT 



<210> SEQ ID NO 1132 

<211> Length : 359 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1132 
>HUMGRP5E_node_8 

GTTCTCAACGTGAAGGAAGGAACCCCCAGCTGAACCAGCAATGATAATGATGGCCTCTCTCAAAAGAGAAAAACAAA 
ACCCCTAAGAGACTGCGTTCTGCAAGCATCAGTTCTACGGATCATCAACAAGATTTCCTTGTGCAAAATATTTGACT 
ATTCTGTATCTTTCATCCTTGACTAAATTCGTGATTTTCAAGCAGCATCTTCTGGTTTAAACTTGTTTGCTGTGAAC 
AATTGTCGAAAAGAGTCTTCCAATTAATGCTTTTTTATATCTAGGCTACCTGTTGGTTAGATTCAAGGCCCCGAGCT 
GTTACCATTCACAATAAAAGCTTAAACACATTGTCCAAAGGGCAGGCTGTT 
<210> SEQ ID NO 1133

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1133 

>HUMGRP5E_node_3 

GTAGGTTCAAAAGGCAAAG 

<210> SEQ ID NO 1134 

<211> Length : 14 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1134 

>HUMGRP5E_node_7 

ACTCTCTGCTCCAG 

Segment nucleic acid sequences: 

<210> SEQ ID NO 1135 

<211> Length : 178 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1135 
>D56406_PEA_1_node_0
TTCACTCACTTTCAAAGCCAGCTGAAGGAAAGAGGAAGTGCTAGAGAGAGCCCCCTTCAGTGTGCTTCTGACTTTTA
CGGACTTGGCTTGTTAGAAGGCTGAAAGATGATGGCAGGAATGAAAATCCAGCTTGTATGCATGCTACTCCTGGCTT 
TCAGCTCCTGGAGTCTGTGCTCAG 



<210> SEQ ID NO 1136 

<211> Length : 780 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1136 
>D56406_PEA_1_node_13

TTAATCCAGGAAGATATTCTTGATACTGGAAATGACAAAAATGGAAAGGAAGAAGTCATAAAGAGAAAAATTCCTTA 
TATTCTGAAACGGCAGCTGTATGAGAATAAACCCAGAAGACCCTACATACTCAAAAGAGATTCTTACTATTACTGAG 
AGAATAAATCATTTATTTACATGTGATTGTGATTCATCATCCCTTAATTAAATATCAAATTATATTTGTGTGAAAAT 
GTGACAAACACACTTATCTGTCTCTTCTACAATTGTGGTTTATTGAATGTGATTTTTCTGCACTAATATAAATTAGA 
CTAAGTGTTTTCAAATAAATCTAAATCTTCAGCATGATGTGTTGTGTATAATTGGAGTAGATATTAATTAAGTCACC 
TGTATAATGTTTTGTAATTTTGCAAAACATATCTTGAGTTGTTTAAACAGTCAAAATGTTTGATATTTTATACCAGC 
TTATGAGCTCAAAGTACTACAGCAAAGCCTAGCCTGCATATCATTCACCCAAAACAAAGTAATAGCGCCTCTTTTAT 
TATTTTGACTGAATGTTTTATGGAATTGAAAGAAACATACGTTCTTTTCAAGACTTCCTCATGAATCTCTCAATTAT 
AGGAAAAGTTATTGTGATAAAATAGGAACAGCTGAAAGATTGATTAATGAACTATTGTTAATTCTTCCTATTTTAAT 
GAATGACATTGAACTGAATTTTTTGTCTGTTAAATGAACTTGATAGCTAATAAAAAGACAACTAGCCATCAAAATCA 

AAAGTTTCTC 



<210> SEQ ID NO 1137 

<211> Length : 93 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1137 
>D56406_PEA_1_node_11
GCACGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAAGAGATGGAGAC
CATCCCGGCTAACACG 

<210> SEQ ID NO 1138 

<211> Length : 6 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1138 
>D56406_PEA_1_node_2
ATTCAG 

<210> SEQ ID NO 1139 

<211> Length : 56 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1139 
>D56406_PEA_1_node_3

AAGAGGAAATGAAAGCATTAGAAGCAGATTTCTTGACCAATATGCATACATCAAAG

<210> SEQ ID NO 1140 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 1140 
>D56406_PEA_1_node_5

ATTAGTAAAGCACATGTTCCCTCTTGGAAGATGACTCTGCTAAATGTTTGCAGTCTTGTAAATAATTTGAACAGCCC 
AGCTGAGGAAACAGGAGAAGTTCATGAAGAGGAGCTTG 

<210> SEQ ID NO 1141 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1141 
>D56406_PEA_1_node_6

TTGCAAGAAGGAAACTTCCTACTGCTTTAGATGG 

<210> SEQ ID NO 1142 

<211> Length : 26 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1142 
>D56406_PEA_1_node_7
CTTTAGCTTGGAAGCAATGTTGACAA 

<210> SEQ ID NO 1143 

<211> Length : 8 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1143 
>D56406_PEA_1_node_8
TATACCAG



<210> SEQ ID NO 1144 

<211> Length : 42 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1144 
>D56406_PEA_1_node_9

CTCCACAAAATCTGTCACAGCAGGGCTTTTCAACACTGGGAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1145 

<211> Length : 245 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1145 
>F05068_PEA_1_node_0

AAGAAAGGGAAGGCAACCGGGCAGCCCAGGCCCCGCCCCGCCGCTCCCCCACCCGTGCGCTTATAAAGCACAGGAAC 
CAGAGCTGGCCACTCAGTGGTTTCTTGGTGACACTGGATAGAACAGCTCAAGCCTTGCCACTTCGGGCTTCTCACTG 
CAGCTGGGCTTGGACTTCGGAGTTTTGCCATTGCCAGTGGGACGTCTGAGACTTTCTCCTTCAAGTACTTGGCAGAT 

CACTCTCTTAGCAG 



<210> SEQ ID NO 1146 
<211> Length : 161

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1146 
>F05068_PEA_1_node_10

CTTCGGGACGTGCACGGTGCAGAAGCTGGCACACCAGATCTACCAGTTCACAGATAAGGACAAGGACAACGTCGCCC 
CCAGGAGCAAGATCAGCCCCCAGGGCTACGGCCGCCGGCGCCGGCGCTCCCTGCCCGAGGCCGGCCCGGGTCGGACT 

CTGGTGT 

<210> SEQ ID NO 1147 

<211> Length : 121 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1147 
>F05068_PEA_1_node_12

CCATGGTACAAGGAATAGTCGCGCAAGCATCCCGCTGGTGCCTCCCGGGACGAAGGACTTCCCGAGCGGTGTGGGGA 
CCGGGCTCTGACAGCCCTGCGGAGACCCTGAGTCCGGGAGGCAC 

<210> SEQ ID NO 1148 

<211> Length : 631 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1148 
>F05068_PEA_1_node_13

CGTCCGGCGGCGAGCTCTGGCTTTGCAAGGGCCCCTCCTTCTGGGGGCTTCGCTTCCTTAGCCTTGCTCAGGTGCAA 
GTGCCCCAGGGGGCGGGGTGCAGAAGAATCCGAGTGTTTGCCAGGCTTAAGGAGAGGAGAAACTGAGAAATGAATGC 
TGAGACCCCCGGAGCAGGGGTCTGAGCCACAGCCGTGCTCGCCCACAAACTGATTTCTCACGGCGTGTCACCCCACC
AGGGCGCAAGCCTCACTATTACTTGAACTTTCCAAAACCTAAAGAGGAAAAGTGCAATGCGTGTTGTACATACAGAG 
GTAACTATCAATATTTAAGTTTGTTGCTGTCAAGATTTTTTTTGTAACTTCAAATATAGAGATATTTTTGTACGTTA 
TATATTGTATTAAGGGCATTTTAAAAGCAATTATATTGTCCTCCCCCTATTTTAAGACGTGAATGTCTCAGCGAGGT 
GTAAAGTTGTTCGCCGCGTGGAATGTGAGTGTGTTTGTGTGCATGAAAGAGAAAGACTGATTACCTCCTGTGTGGAA 
GAAGGAAACACCGAGTCTCTGTATAATCTATTTACATAAAATGGGTGATATGCGAACAGCAAACCAATAAACTGTCT 
CAATGCTGAATAAAA 

<210> SEQ ID NO 1149 

<211> Length : 150 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1149 
>F05068_PEA_1_node_4

GTGAGTCCGGGCAGCGCCTTCCCCCTTGCTGGTACCTGGCAGGCAAGGGGAACTGACCGTTGGTCCCGAAGGTCTAG 
AAGTGAATGGGAGCAGGGACAGGCCTGGGCGTCACCTGAACGCACGCGAATCGGGTCTGCTTGTGTTTTCCAG 

<210> SEQ ID NO 1150 

<211> Length : 0 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1150 
>F05068_PEA_1_node_8

GTAACTACGCCCTGTGCTGTCCAGGGACGGGAGGGAAGGAAGGTGTGCGGGAGGAGTTCTCTGTCTCCACTCCCCTG 
GCCCGGGGGATCGTCGGGGCTGGACCGCAGCTCAGATGGCGCGAGCAGTTTCCAGCTCCCTCTGGCTCTAGAATGGC 
TCCCGTTCCCGGTGTTGGGGCCAAAGCTCTGCTTGATGGGGTCTCAAGTTGCCTTTCTTCCCCCTCCCCCCGCCCGC 
AG 
<210> SEQ ID NO 1151

<211> Length : 76 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1151 
>F05068_PEA_1_node_11

CTTCTAAGCCACAAGCACACGGGGCTCCAGCCCCCCCGAGTGGAAGTGCTCCCCACTTTCTTTAGGATTTAGGCGC 

<210> SEQ ID NO 1152 

<211> Length : 119 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1152 
>F05068_PEA_1_node_3

GGTCTGCGCTTCGCAGCCGGGATGAAGCTGGTTTCCGTCGCCCTGATGTACCTGGGTTCGCTCGCCTTCCTAGGCGC 
TGACACCGCTCGGTTGGATGTCGCGTCGGAGTTTCGAAAGAA 

<210> SEQ ID NO 1153 

<211> Length : 59

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1153 
>F05068_PEA_1_node_5

GTGGAATAAGTGGGCTCTGAGTCGTGGGAAGAGGGAACTGCGGATGTCCAGCAGCTACC 
<210> SEQ ID NO 1154

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1154 
>F05068_PEA_1_node_6

CCACCGGGCTCGCTGACGTGAAGGCCGGGCCTGCCCAGAC 

<210> SEQ ID NO 1155 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1155 
>F05068_PEA_1_node_7

CCTTATTCGGCCCCAGGACATGAAGGGTGCCTCTCGAAGCCCCGAAGACAG 

<210> SEQ ID NO 1156 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1156 
>F05068_PEA_1_node_9

CAGTCCGGATGCCGCCCGCATCCGAGTCAAGCGCTACCGCCAGAGCATGAACAACTTCCAGGGCCTCCGGAGCTTTG 
GCTGCCG 

Segment nucleic acid sequences: 
<210> SEQ ID NO 1157

<211> Length : 573 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1157 
>H14624_node_0

TTATGCTCCCGCGGAGGCCAAGCGGACTCCCTGACAGGACAGAATCTGAACGTGAGAGTGAAGGTCTTGCCTGTCCA 
GAAACTCTTGTAGCCAGCACAGGTTTAAACAAGAAGCCAAATTGTTCTGGAGAGATTGCTGGGGGCTTTCTTTGTGC 
CTCAAGCTTCTTCAGTGCCCTGAGCACAGGAAACACTCAAGCAGAGAAGCAGAGCCAAACCCAGGATACGGGAGGTC 
GAGGCTCTTCCGTAGACCTGCAGCATTGGGGTGGGATGATGTTCATTCTGTGTGTGTTCTGGACCAAGCCCCTCTCC 
AGGGACCTATGGGCAGCCCCCTTTAAGCAAGATGCCCGGTGGAGTGGGCATCCACCATCACTTACCCTGGGCTTGGG 
TGAATAGATTTTCCGTGCCTTAAATGGGCAGGGAGGGGGTAAACATGGACGGTCCATTGGTACAAATAAAAGCCTTT 
GGTGGGTTTTGATCAATTGCAAGGATCGAAGGAGACCTGTGGACCTGAGGTCAACTGGCAGCAGAGAAGAGTCTGGG 
TTCGTGAAGGCGCCGCCGCGGTGCCGCGCCACGT 

<210> SEQ ID NO 1158 

<211> Length : 387 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1158 
>H14624_node_16

GTAAGCCTTCCCTCTTGCTTCCCCACTCCCTGCTGGGCTGAGACGCTCCCAGGAGATCCCGCCCCTGCCACGCATCC 
CAGTGCATCCCTGCTTGGGGTGCCAGTAGCGGGAAGGGCAGAAGTTCTGCCTGACCTGGTCTGTCATCACAACAAGC 
CTGTATCAAATTTGAGGCACCCCTCCCACGCCGCCCAAGTCTCGCGCATTCTCCTCCCGAGTTGTACCAGCTATACT 
TAAGGGCAGTTTAAAAATAAAACAAACAAACAAAAACAACAAAACTAAAAAAACGAAGAACTGAACGGCGGTTTAAA 
AAAAAATAGATACACGATTATTGTTAAAGATGCTAGCACTGGAGCTGCGCAGAGCGTTGGAAGTGGTGTTTGGTGGA 

GG 

<210> SEQ ID NO 1159 
<211> Length : 249 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1159 
>H14624_node_3 

ATTTGCATAAAAAAGGCCAAGAAAACTCTGGCTGTGCCCCAGCAACGGCTCATTCTGCTCCCCCGGGTCGGAGCCCC 
CCGGAGCTGCGCGCGGGCTTGCAGCGCCTCGCCCGCGCTGTCCTCCCGGTGTCCCGCTTCTCCGCGCCCCAGCCGCC 
GGCTGCCAGCTTTTCGGGGCCCCGAGTCGCACCCAGCGAAGAGAGCGGGCCCGGGACAAGCTCGAACTCCGGCCGCC 

TCGCCCTTCCCCGGCTCC 

<210> SEQ ID NO 1160 

<211> Length : 10 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1160 

>H14624_node_10

GTGCTGGAGC 

<210> SEQ ID NO 1161 

<211> Length : 35 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1161 
>H14624_node_11

AGGCCGGCGCTTGGATCCCGCTGGTCATGAAGCAG 



<210> SEQ ID NO 1162 
<211> Length : 21

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1162 
>H14624_node_12
TGCCACCCGGACACCAAGAAG 

<210> SEQ ID NO 1163 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1163 
>H14624_node_13

TTCCTGTGCTCGCTCTTCGCCCCCGTCTGCCTCGATGACCTAGACGAGACCATCCAGCCATGCCACTCGCTCTGCGT 
GCAGGTGAAGGACCG 

<210> SEQ ID NO 1164 

<211> Length : 60 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1164 
>H14624_node_14

CTGCGCCCCGGTCATGTCCGCCTTCGGCTTCCCCTGGCCCGACATGCTTGAGTGCGACCG 



<210> SEQ ID NO 1165 
<211> Length : 71

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1165 
>H14624_node_15

TTTCCCCCAGGACAACGACCTTTGCATCCCCCTCGCTAGCAGCGACCACCTCCTGCCAGCCACCGAGGAAG 

<210> SEQ ID NO 1166 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1166 
>H14624_node_4

GCTCCCTCTGCCCCCTCGGGGTCGCGCGCCCACGATGCTGCAGGGCCCTGGCTCGCTGCTGCTGCTCTTC 

<210> SEQ ID NO 1167 

<211> Length : 11 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1167 
>H14624_node_5
CTCGCCTCGCA 

<210> SEQ ID NO 1168 
<211> Length : 24 
<212> Type : DNA

<213> Organism : Homo sapiens 

<400> sequence : 1168 

>H14624_node_6

CTGCTGCCTGGGCTCGGCGCGCGG 

<210> SEQ ID NO 1169 

<211> Length : 7 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1169 
>H14624_node_7
GCTCTTC 

<210> SEQ ID NO 1170 

<211> Length : 80 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1170 
>H14624_node_8

CTCTTTGGCCAGCCCGACTTCTCCTACAAGCGCAGCAATTGCAAGCCCATCCCGGCCAACCTGCAGCTGTGCCACGG 
CAT 

<210> SEQ ID NO 1171 
<211> Length : 55 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1171 
>H14624_node_9 

CGAATACCAGAACATGCGGCTGCCCAACCTGCTGGGCCACGAGACCATGAAGGAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1172 

<211> Length : 213 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1172 
>H38804_PEA_1_node_0

GACTGGGTTGACCGATGCTGGGCAGCTGAGCGGACCAATCGGCCCCCTAGACTGAGACGTTGGCGTTTGAAATCAGC 
CAATGGCAGGTCTACACTGGAGCTTCCTCTCCGCCTCCTTCGCCTAGCCTGCGAGTGTTCTGAGGGAAGCAAGGAGG 
CGGCGGCGGCCGCAGCGAGTGGCGAGTAGTGGAAACGTTGCTTCTGAGGGGAGCCCAAG 

<210> SEQ ID NO 1173 

<211> Length : 432 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1173 
>H38804_PEA_1_node_1

GTAGGGAGGCGAGGCGACGGTGTGCGGGAGCGGGCTCTCCAGGGACTTCCCGGGTCCGCAACTGGCAGGGCCGTTCG 
ATTCGCAGGGGATCCCGTTTCGTTTCTGTTGTTTTCCCTTTATTTTTAGGAGTGCCCGGGGCGACGGGACCCCGGGA 
GAGGGGAAAGGGAACAGTCTGGGGTCCGGGCATCGCTGTGGGCCGGGCTGGGTTTAGGGGGACGGCGGTGCGGGCTG 
GGCCGGTTTGGGCGCGGCGGGGGCCGGATGATGGGGCGAGTCCGGACCTTGGCGGGCGAGTGCTCGGCGCAGGCGCA
AGCGCAGAGTCTCCTCGCGGTCGTCCTCTCGGCCCCTCCCTCTGGGGGGACCCCCAGTGCCAGGCTGTCAGTGCGCA 
GCCCCAGCCCGCGGGACCCCTGGGGACTCTGGGCGCCTGTTCTGCAG 

<210> SEQ ID NO 1174 

<211> Length : 159 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1174 
>H38804_PEA_1_node_16

GTATATACCCTCTCAGTGTCTGGAGACCGGCTGATTGTGGGAACAGCAGGCCGCAGAGTGTTGGTGTGGGACTTACG 
GAACATGGGTTACGTGCAGCAGCGCAGGGAGTCCAGCCTGAAATACCAGACTCGCTGCATACGAGCGTTTCCAAACA 

AGCAG 

<210> SEQ ID NO 1175 

<211> Length : 139 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1175 
>H38804_PEA_1_node_19

GGTTATGTATTAAGCTCTATTGAAGGCCGAGTGGCAGTTGAGTATTTGGACCCAAGCCCTGAGGTACAGAAGAAGAA 
GTATGCCTTCAAATGTCACAGACTAAAAGAAAATAATATTGAGCAGATTTACCCAGTCAATG 



<210> SEQ ID NO 1176 
<211> Length : 196 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1176 
>H38804_PEA_1_node_24

ATATTTGGGATCCATTTAACAAAAAGCGACTGTGCCAATTCCATCGGTACCCCACGAGCATCGCATCACTTGCCTTC 
AGTAATGATGGGACTACGCTTGCAATAGCGTCATCATATATGTATGAAATGGATGACACAGAACATCCTGAAGATGG 

TATCTTCATTCGCCAAGTGACAGATGCAGAAACAAAACCCAA 

<210> SEQ ID NO 1177 

<211> Length : 353 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1177 
>H38804_PEA_1_node_25

GTGAGTATGCTTCACCTGTATTTGAGCCTTTTCTTGCATTCAACCCAGGATTTATTAATTTTTCTAAATTCATGAAT 
AGCATTGTTGATGCCTGCTCGATATTACAGCTGACTGTAGGGTTGGAGTTGATGTTATCATGTTCTCCCAAGCTTTC 
AATATCCGTAGGTTGATAGACGTCTGATGGATAAAATTGTGCCTAGTTGTTTTGTAGAGAAGAATGTCAAACTCTTA 
TTCTTCTTGAATAGGCTCTATTATTTGAATCTCTGGAGTTATTACCAGCTCATTGCTTCAAAATTAAGTTGAGGAAT 
TCAAGAATAATTTATTTTAGTAAATTCTATTTAAGATGTTTAAGA 

<210> SEQ ID NO 1178 

<211> Length : 590 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1178 
>H38804_PEA_1_node_28

TATTAACACAAAGTAAGTGACCTTCAGGTCTTATTGGAAACTCAGAGTAATATGGCCTTGCCTGGAATTGCAAATTT 
CCTTAGTTTTGAAATTTTCATAGATGTCTTTGGTTCTTGGTTGTAACTGTTGACTGAGAAGAGCCATTTACATTTTT 
TGATACCAACAGGGCAAAGCTTTTTACTTAATTACCTCTACCAGGCTTTAAGGGAAATCTGATACTTCAGCATGTGT
TAAACTATAAAATACCTACTCCAAGTATCTGCCCAGTTCCTTGTCCCCTCTCCCCAGGCCCTTAAAGGAAGTTCTCG 
ATACATATTTGTAGAATAACTGAATGTTTTCAGGATTCCTGTACTTTGCTGAGTTAAAATGGATATGGTACCCTTGC 
TGATTGGTTGAGCCCCTAAGAGGGGGCAGAATATTAAATATTCCATATCAGATATGCTTTTACAGGTTTGACTTTAG 
AAAAGTCTTAGCATGTGAAGCCTGTTGGATAAAGGGCTGTGTTTGCATTTAATCTGTCACTTTTGTATCTCCTGTCC 
TGGCTGGCCATTTTGATCTCATGCTGTTCTTTTTTTCTTTTGAACTTGTAG 

<210> SEQ ID NO 1179 

<211> Length : 1,228 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1179 
>H38804_PEA_1_node_29

GTCACCATGTACTTGACAAGATTTCATTTACTTAAGTGCCATGTTGATGATAATAAAACAATTCGTACTCCCCAATG 
GTGGATTTATTACTATTAAAGAAACCAGGGAAAATATTAATTTTAATATTATAACAACCTGAAAATAATGGAAAAGA 
GGTTTTTGAATTTTTTTTTTTAAATAAACACCTTCTTAAGTGCATGAGATGGTTTGATGGTTTGCTGCATTAAAGGT 
ATTTGGGCAAACAAAATTGGAGGGCAAGTGACTGCAGTTTTGAGAATCAGTTXTGACCTTGATGATTTTTTGTTTCC 
ACTGTGGAAATAAATGTTTGTAAATAAGTGTAATAAAAATCCCTTTGCATTCTTTCTGGACCTTAAATGGTAGAGGA 
AAAGGCTCGTGAGCCATTTGTTTCTTTTGCTGGTTATAGTTGCTAATTCTAAAGCTGCTTCAGACTGCTTCATGAGG 
AGGTTAATCTACAATTAAACAATATTTCCTCTTGGCCGTCCATTATTTTCTGAAGCAGATGGTTCATCATTTCCTGG 
GCTGTTAAACAAAGCGAGGTTAAGGTTAGACTCTTGGGAATCAGCTAGTTTTCAATCTTATTAGGGTGCAGAAGGAA 
AACTAATAAGAAAACCTCCTAATATCATTTTGTGACTGTAAACAATTATTTATTAGCAAACAATTGATCCCAGAAGG 
GCAAATTGTTTGAGTCAGTAATGAGCTGAGAAAAGACAGAGCATATCTGTGTATTTGGAAAAATAATTGTAACGTAA 
TTGCAGTGCATTTAGACAGGCATCTATTTGGACCTGTTTCTATCTCTAAATGAATTTTTGGAAACATTAATGAGGTT 
TACATATTTCTCTGACATTTATATAGTTCTTATGTCCATTTCAGTTGACCAGCCGCTGGTGATTAAAGTTAAAAAGA 
AAAAAATTATAGTGAGAATGAGATTCATTTCAATGTAATGCACTAAAGCAGAACACGAACTTAGCTTGGCCTATTCT 
AGGTAGTTCCAAATAGTATTTTTGTTGTCAAACTTTAAAATTTATATTAATTTGCAAATGTATGTCTCTGAGTAGGA 
CTTGGACCTTTCCTGAGATTTATTTTATCCGTGATGTATTTTTTTTAATTCTTTTGATACAGAGAAGGGTCTTTTTT 
TTTTTTAAGTATTTCAGTGAAAACTTGGTGTAAGTCTGAACCCATCTTTTGAAATGTATTTTCTTCATTGCAG 



<210> SEQ ID NO 1180 
<211> Length : 326

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1180 
>H38804_PEA_1_node_30

GTCCACCTAATCATCCTGTGAAAGTGGTTTCTCTATGGAAAGCTTTGTTTGCTTCCTACAAATACATGCTTATTCCT 
TAAGGGATGTGTTAGAGTTACTGTGGATTTCTCTGTTTTCTGTCTTACAAGAAACTTGTCTATGTACCTTAATACTT 
TGTTTAGGATGAGGAGTCTTTGTGTCCCTGTACAGTAGTCTGACGTATTTCCCCTTCTGTCCCCTAGTAAGCCCAGT 
TGCTGTATCTGAACAGTTTGAGCTCTTTTTGTAATATACTCTAAACCTGTTATTTCTGTGCTAATAAACGAGATGCA 
GAACCCTTGAAAAATGGA 

<210> SEQ ID NO 1181 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1181 
>H38804_PEA_1_node_10

GATCCAACGCATGCCTGGAGTGGAGGACTAGATCATCAATTGAAAATGCATGATTTGAACACTGATCAAG 

<210> SEQ ID NO 1182 

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1182 
>H38804_PEA_1_node_12

AAAATCTTGTTGGGACCCATGATGCCCCTATCAGATGTG 
<210> SEQ ID NO 1183

<211> Length : 79 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1183 
>H38804_PEA_1_node_13

TTGAATACTGTCCAGAAGTGAATGTGATGGTCACTGGAAGTTGGGATCAGACAGTTAAACTGTGGGATCCCAGAACT 
CC 

<210> SEQ ID NO 1184 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1184 
>H38804_PEA_1_node_14

TTGTAATGCTGGGACCTTCTCTCAGCCTGAAAAG 

<210> SEQ ID NO 1185 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1185 
>H38804_PEA_1_node_2

ATGACCGGTTCTAACGAGTTCAAGCTGAACCAG 
<210> SEQ ID NO 1186

<211> Length : 39 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1186 
>H38804_PEA_1_node_20

CCATTTCTTTTCACAATATCCACAATACATTTGCCACAG 

<210> SEQ ID NO 1187 

<211> Length : 21 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1187 
>H38804_PEA_1_node_23
GTGGTTCTGATGGCTTTGTAA 

<210> SEQ ID NO 1188 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1188 
>H38804_PEA_1_node_26

ATTTGAACTGCCAAAAATCTTTCCTCTCCACAGAGGTTGTTTCTTTAA 
<210> SEQ ID NO 1189

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1189 
>H38804_PEA_1_node_3

CCACCCGAGGATGGCATCTCCTCCGTGAAGTTCAGCCC 

<210> SEQ ID NO 1190 

<211> Length : 111 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1190 
>H38804_PEA_1_node_4

CAACACCTCCCAGTTCCTGCTTGTCTCCTCCTGGGACACGTCCGTGCGTCTCTACGATGTGCCGGCCAACTCCATGC 
GGCTCAAGTACCAGCACACCGGCGCCGTCCTGGA 

<210> SEQ ID NO 1191 

<211> Length : 13 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1191 
>H38804_PEA_1_node_5
CTGCGCCTTCTAC 




Segment nucleic acid sequences: 

<210> SEQ ID NO 1192 

<211> Length : 257 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1192 
>HSENA78_node_0

AGTGGGGAGAGATGAGTGTAGATAAAAGGAGTGCAGAAGGCACGAGGAAGCCACAGTGCTCCGGATCCTCCAATCTT 
CGCTCCTCCAATCTCCGCTCCTCCACCCAGTTCAGGAACCCGCGACCGCTCGCAGCGCTCTCTTGACCACTATGAGC 
CTCCTGTCCAGCCGCGCGGCCCGTGTCCCCGGTCCTTCGAGCTCCTTGTGCGCGCTGTTGGTGCTGCTGCTGCTGCT 
GACGCAGCCAGGGCCCATCGCCAGCG 

<210> SEQ ID NO 1193 

<211> Length : 133 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1193 
>HSENA78_node_2

CTGGTCCTGCCGCTGCTGTGTTGAGAGAGCTGCGTTGCGTTTGTTTACAGACCACGCAGGGAGTTCATCCCAAAATG 
ATCAGTAATCTGCAAGTGTTCGCCATAGGCCCACAGTGCTCCAAGGTGGAAGTGGT 

<210> SEQ ID NO 1194 
<211> Length : 1,786 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1194 
>HSENA78_node_6

TGGAAACAAGGAAAACTGATTAAGAGAAATGAGCACGCATGGAAAAGTTTCCCAGTCTTCAGCAGAGAAGTTTTCTG 

GAGGTCTCTGAACCCAGGGAAGACAAGAAGGAAAGATTTTGTTGTTGTTTGTTTATTTGTTTTTCCAGTAGTTAGCT 

TTCTTCCTGGATTCCTCACTTTGAAGAGTGTGAGGAAAACCTATGTTTGCCGCTTAAGCTTTCAGCTCAGCTAATGA 

AGTGTTTAGCATAGTACCTCTGCTATTTGCTGTTATTTTATCTGCTATGCTATTGAAGTTTTGGCAATTGACTATAG 

TGTGAGCCAGGAATCACTGGCTGTTAATCTTTCAAAGTGTCTTGAATTGTAGGTGACTATTATATTTCCAAGAAATA 

TTCCTTAAGATATTAACTGAGAAGGCTGTGGATTTAATGTGGAAATGATGTTTCATAAGAATTCTGTTGATGGAAAT 

ACACTGTTATCTTCACTTTTATAAGAAATAGGAAATATTTTAATGTTTCTTGGGGAATATGTTAGAGAATTTCCTTA 

CTCTTGATTGTGGGATACTATTTAATTATTTCACTTTAGAAAGCTGAGTGTTTCACACCTTATCTATGTAGAATATA 

TTTCCTTATTCAGAATTTCTAAAAGTTTAAGTTCTATGAGGGCTAATATCTTATCTTCCTATAATTTTAGACATTCT 

TTATCTTTTTAGTATGGCAAACTGCCATCATTTACTTTTAAACTTTGATTTTATATGCTATTTATTAAGTATTTTAT 

TAGGAGTACCATAATTCTGGTAGCTAAATATATATTTTAGATAGATGAAGAAGCTAGAAAACAGGCAAATTCCTGAC 

TGCTAGTTTATATAGAAATGTATTCTTTTAGTTTTTAAAGTAAAGGCAAACTTAACAATGACTTGTACTCTGAAAGT 

TTTGGAAACGTATTCAAACAATTTGAATATAAATTTATCATTTAGTTATAAAAATATATAGCGACATCCTCGAGGCC 

CTAGCATTTCTCCTTGGATAGGGGACCAGAGAGAGCTTGGAATGTTAAAAACAAAACAAAACAAAAAAAAACAAGGA 

GAAGTTGTCCAAGGGATGTCAATTTTTTATCCCTCTGTATGGGTTAGATTTTCCAAAATCATAATTTGAAGAAGGCC 

AGCATTTATGGTAGAATATATAATTATATATAAGGTGGCCACGCTGGGGCAAGTTCCCTCCCCACTCACAGCTTTGG 

CCCCTTTCACAGAGTAGAACCTGGGTTAGAGGATTGCAGAAGACGAGCGGCAGCGGGGAGGGCAGGGAAGATGCCTG 

TCGGGTTTTTAGCACAGTTCATTTCACTGGGATTTTGAAGCATTTCTGXCTGAATGTAAAGCCTGTTCTAGTCCTGG 

TGGGACACACTGGGGTTGGGGGTGGGGGAAGATGCGGTAATGAAACCGGTTAGTCAGTGTTGTCTTAATATCCTTGA 

TAATGCTGTAAAGTTTATTTTTACAAATATTTCTGTTTAAGCTATTTCACCTTTGTTTGGAAATCCTTCCCTTTTAA 

AGAGAAAATGTGACACTTGTGAAAAGGCTTGTAGGAAAGCTCCTCCCTTTTTTTCTTTAAACCTTTAAATGACAAAC 

CTAGGTAATTAATGGTTGTGAATTTCTATTTTTGCTTTGTTTTTAATGAACATTTGTCTTTCAGAATAGGATTCTGT 

GATAATATTTAAATGGCAAAAACAAAACATAATTTTGTGCAATTAACAAAGCTACTGCAAGAAAAATAAAACATTTC 

TTGGTAAAAACGTAT 

<210> SEQ ID NO 1195 

<211> Length : 153 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1195
>HSENA78_node_9

ATATAATATATATTATATATTTAGCATTGCTGAGCTTTTTAGATGCCTATTGTGTATCTTTTAAAGGTTTTGACCAT 
TTTGTTATGAGTAATTACATATATATTACATTCACTATATTAAAATTGTACTTTTTTACTATGTGTCTCATTGGTT 

<210> SEQ ID NO 1196 

<211> Length : 110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1196 
>HSENA78_node_3

GTAAGTTCTGTGCTGCTGTGTCCGCTGTGACCTTGGCAAGAGAGAAATCCCGCAGCCTGGGTCTTCAACCTTGGTAT 
CTCATGAGTGTATCTTCTTTTTCTTTCCTTCAG 

<210> SEQ ID NO 1197 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1197 
>HSENA78_node_4

AGCCTCCCTGAAGAACGGGAAGGAAATTTGTCTTGATCCAGAAGCCCCTTTTCTAAAGAAAGTCATCCAGAAAATTT 
TGGACGG 

<210> SEQ ID NO 1198 

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1198
>HSENA78_node_8
GTATTTATATATTATATATTTAT 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1199 

<211> Length : 139 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1199 
>HUMODCA_node_1

GTGCGTCTCCATGGCGACCCGCCGGTGCTATAAGTAGGGAGCGGCGTGCCGTGGGGCTTTGTCAGTCCCTCCTGTAG 
CCGCCGCCGCCGCCGCCCGCCGCCCCTCTGCCAGCAGCTCCGGCGCCACCTCGGGCCGGCGT 

<210> SEQ ID NO 1200 

<211> Length : 135 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1200 
>HUMODCA_node_25

GTTGGTTTTGCGGATTGCCACTGATGATTCCAAAGCAGTCTGTCGTCTCAGTGTGAAATTCGGTGCCACGCTCAGAA 
CCAGCAGGCTCCTTTTGGAACGGGCGAAAGAGCTAAATATCGATGTTGTTGGTGTCAG 



<210> SEQ ID NO 1201 
<211> Length : 163

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1201 
>HUMODCA_node_32 

ATCACCGGCGTAATCAACCCAGCGTTGGACAAATACTTTCCGTCAGACTCTGGAGTGAGAATCATAGCTGAGCCCGG 
CAGATACTATGTTGCATCAGCTTTCACGCTTGCAGTTAATATCATTGCCAAGAAAATTGTATTAAAGGAACAGACGG 

GCTCTGATG 

<210> SEQ ID NO 1202 

<211> Length : 215 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1202 
>HUMODCA_node_36

AGACCTAAACCAGATGAGAAGTATTATTCATCCAGCATATGGGGACCAACATGTGATGGCCTCGATCGGATTGTTGA 
GCGCTGTGACCTGCCTGAAATGCATGTGGGTGATTGGATGCTCTTTGAAAACATGGGCGCTTACACTGTTGCTGCTG 
CCTCTACGTTCAATGGCTTCCAGAGGCCGACGATCTACTATGTGATGTCAGGGCCTGCGTG 

<210> SEQ ID NO 1203 

<211> Length : 173 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1203 
>HUMODCA_node_39
GATGCCAGCACCCTGCCTGTGTCTTGTGCCTGGGAGAGTGGGATGAAACGCCACAGAGCAGCCTGTGCTTCGGCTAG
TATTAATGTGTAGATAGCACTCTGGTAGCTGTTAACTGCAAGTTTAGCTTGAATTAAGGGATTTGGGGGGACCATGT 
AACTTAATTACTGCTAGTT 

<210> SEQ ID NO 1204 

<211> Length : 1,096 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1204 
>HUMODCA_node_41 

TTGAATATTTGTTTTATATGGATTTTTATTCACTCTTCAGACACGCTACTCAAGAGTGCCCCTCAGCTGCTGAACAA 
GCATTTGTAGCTTGTACAATGGCAGAATGGGCCAAAAGCTTAGTGTTGTGACCTGTTTTTAAAATAAAGTATCTTGA 
AATAATTAGGCATTGGGACGTTTTTATGGTGTGTTCATTCCAGACAGTTCACGAATCCCGTATAGCTCGCTCTGATT 
CTCAGAGAACAATGAGTGGGTCCACCCACACACAGGTAGGAGGACAGGTGAGACGGAAGCCCCATCCTCCCATGTGG 
ACGGTGCACATCTGCTCAGCCCACCCCACATGTCCAGAGTTGGCTGCAAACTCCTTGTCCAGAGCCTCTGGTGGTGG 
GACCTACTTAAGTCTGACGGACCTGTCCTGTCCAGGCCAGTGCCCAGGGAAGGTGTGGGAGGCCCTTTGAGCCTGGC 
CTGCAGAGACCATCCGTGTCCCCTCCCACCTTCATGCCTGTGAGAAGTTAGGAATGTATACGGTACCACATTTGGCA 
GTCAGCTTATTTTAATAAATTCAGCAACAGCAAGTCCCTACCATGTTGTGTATCTTCACCATCTTGTCTGACCATGA 
CCACTGGCCTTGTGTGTTCTTTTACTCAACGTGTACCCCCGCTCTCCCCCAAAGTGTGGCAGGCTCTCATGCTCCTT 
AACCCCCATTGTGGCAATGTCTTACGGGTAACGCTGGAGCTGCAGGAGGAGGGAAGGACACGTCAGAGCCACCAGGC 
AGTGGGAGCATCTTGGAGTCCCCACCAGCCTCATGAGGGGGACAGGAAGAGAGCAAATGTGTAGGGAGGAAGGCTGT 
GGCTCCTCCCGGGGTGGGAAGGTCAAGCCGATGCTGTCACCCATTTACCAAAGCTGAAGAGAGTGACTTCCTTTCTC 
AAAAGCATCACCTTCCCCTGAACCCTGAGTCCAGAGAAGCCAGGAGCCCTCATGTGGCTGCCGAGTTAGCTCAGGGC 
TTGGCTTATCACCAACTCTGGTCTCCCTGGGCCAGGGTTGCCAAAACATGAAAGATTTTTTCAGGAGCCAGAGGTTG 

GTTCTGACTGGAGGGGGA 

<210> SEQ ID NO 1205 

<211> Length : 117 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1205
>HUMODCA_node_0 

GACGTCGGCCCGCCGGCGCCCCACCAGCTCCGCGCGGGCCCGGGTTGGCCACCGCCGGGCCCCCGCCCCTCCCCCGG 
CGGTGTCCCGGCCGGAACCGATCGTGGCTGGTTTGAGCTG 

<210> SEQ ID NO 1206 

<211> Length : 110 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1206 
>HUMODCA_node_10

ATTGTCACTGCTGTTCCAAGGGCACACGCAGAGGGATTTGGAATTCCTGGAGAGTTGCCTTTGTGAGAAGCTGGAAA 
TATTTCTTTCAATTCCATCTCTTAGTTTTCCAT 

<210> SEQ ID NO 1207 

<211> Length : 92 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1207 
>HUMODCA_node_12

AGGAACATCAAGAAATCATGAACAACTTTGGTAATGAAGAGTTTGACTGCCACTTCCTCGATGAAGGTTTTACTGCC 
AAGGACATTCTGGAC 

<210> SEQ ID NO 1208 
<211> Length : 27 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1208 

>HUMODCA_node_13 

CAGAAAATTAATGAAGTTTCTTCTTCT 

<210> SEQ ID NO 1209 

<211> Length : 72 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1209 
>HUMODCA_node_2 

CTCCGGCGGGCGGGAGCCAGGCGCTGACGGGCGCGGCGGGGGCGGCCGAGCGCTCCTGCGGCTGCGACTCAG 

<210> SEQ ID NO 1210 

<211> Length : 82 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1210 
>HUMODCA_node_27 

CTTCCATGTAGGAAGCGGCTGTACCGATCCTGAGACCTTCGTGCAGGCAATCTCTGATGCCCGCTGTGTTTTTGACA 
TGGGG 

<210> SEQ ID NO 1211 
<211> Length : 56 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1211 
>HUMODCA_node_3 

GCTCCGGCGTCTGCGCTTCCCCATGGGGCTGGCCTGCGGCGCCTGGGCGCTCTGAG 

<210> SEQ ID NO 1212 

<211> Length : 84 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1212 
>HUMODCA_node_30

GCTGAGGTTGGTTTCAGCATGTATCTGCTTGATATTGGCGGTGGCTTTCCTGGATCTGAGGATGTGAAACTTAAATT 
TGAAGAG 

<210> SEQ ID NO 1213 

<211> Length : 113 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1213 
>HUMODCA_node_34

ACGAAGATGAGTCGAGTGAGCAGACCTTTATGTATTATGTGAATGATGGCGTCTATGGATCATTTAATTGCATACTC 
TATGACCACGCACATGTAAAGCCCCTTCTGCAAAAG 

<210> SEQ ID NO 1214 
<211> Length : 55 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1214 
>HUMODCA_node_38

GCAACTCATGCAGCAATTCCAGAACCCCGACTTCCCACCCGAAGTAGAGGAACAG 

<210> SEQ ID NO 1215 

<211> Length : 94 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1215 
>HUMODCA_node_40

TTGAAATGTCTTTGTAAGAGTAGGGTCGCCATGATGCAGCCATATGGAAGACTAGGATATGGGTCACACTTATCTGT 
GTTCCTATGGAAACTAT 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1216 

<211> Length : 271 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1216 
>R00299_node_2

GCGGCCGCAGAGCACTTTGCCCGGAGCCCAGCGTCCTCCCCTGAGTTCGCTGAGTCTCCCGGGACCAGCAAAGGCTG 
CGCGCCCCGCATCGGCCCGGAGGCGGGGAGCCCTGGGAGGCCTGGCCGAGCTGCCCGCAGGGAAATGGCGGAGAAAG 
CGCTTCTCTGCCCGAGTTCAGCCGGGCTGGGGACTTGGCCCTGGGTCCTGAACTCGGCATGGCCAGTTCTGCCTCTG 
GCTGTGGACCAGGGTGTGGACTGGAGACCGCGGGGGCCAG 
<210> SEQ ID NO 1217

<211> Length : 172 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1217 
>R00299_node_30

GGGATCGACATTGAGACCAAGATGCACGTCCGCTTCCTTAACATGGAAACCATGGCCCTCTGCCACTGACCCACCGC 

CACCTCCGCGGAGAAACTGCACTTTGCAATGGGGCCGCCTCCCCGCGTAGCTGGAGCAGCCCAGGCCCGGCGGACAG 
CCTCTTCCTGCAGCGCCG 

<210> SEQ ID NO 1218 

<211> Length : 77 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1218 
>R00299_node_10

GAGAACTTCAACAATGTCCCGGACCTGGAGCTCAACCCCATCCGATCCAAAATTGTTCGTGCCTTCTTCGACAACAG 

<210> SEQ ID NO 1219 

<211> Length : 115 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1219 
>R00299_node_14




GAACCTGCGCAAGGGACCCAGTGGCCTGGCTGATGAGATCAATTTCGAGGACTTCCTGACCATCATGTCCTACTTCC 
GGCCCATCGACACCACCATGGACGAGGAACAGGTGGAG 

<210> SEQ ID NO 1220 

<211> Length : 25 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1220 

>R00299_node_15

CTGTCCCGGAAGGAGAAGCTGAGAT 

<210> SEQ ID NO 1221 

<211> Length : 62 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1221 
>R00299_node_20 

TTCTGTTCCACATGTACGACTCGGACAGCGACGGCCGCATCACTCTGGAAGAATATCGAAAT 

<210> SEQ ID NO 1222 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1222 
>R00299_node_23
GTGGTCGAGGAGCTGCTGTCGGGAAACCCTCACATCGAGAAGGAGTCCGCTCGCTCCATCGCCGACGGGGCCATGAT
GGAGGCGGCCAGCGTGTGCATGGGGCAGATG 

<210> SEQ ID NO 1223 

<211> Length : 48 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1223 
>R00299_node_25

GAGCCTGATCAGGTGTACGAGGGGATCACCTTCGAGGACTTCCTGAAG 

<210> SEQ ID NO 1224 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1224 
>R00299_node_28
ATCTGGCAG 

<210> SEQ ID NO 1225 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1225 
>R00299_node_31
GTACATAGCCAAGGCTCGTCTGCGCACCTTGTGTCTTGTAGGGTATGGTATGTGGGACTTCGCTGTTTTTATCTCCA
ATAAAAAAAAAAAAAAGGTTTGTTAATTAAT 

<210> SEQ ID NO 1226 

<211> Length : 70 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1226 
>R00299_node_5

TCTCATCGGATCAGATCGAGCAGCTCCATCGGAGATTTAAGCAGCTGAGTGGAGATCAGCCTACCATTCG 

<210> SEQ ID NO 1227 

<211> Length : 4 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1227 

>R00299_node_9

CAAG 

Segment nucleic acid sequences: 

<210> SEQ ID NO 1228 

<211> Length : 157 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1228
>W60282_PEA_1_node_10

GGCTTGTAGGGGGAGAGACCAGGATCATCAAGGGGTTCGAGTGCAAGCCTCACTCCCAGCCCTGGCAGGCAGCCCTG 
TTCGAGAAGACGCGGCTACTCTGTGGGGCGACGCTCATCGCCCCCAGATGGCTCCTGACAGCAGCCCACTGCCTCAA 

GCC 

<210> SEQ ID NO 1229 

<211> Length : 137 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1229 
>W60282_PEA_1_node_18

TACGCCTGCCTCACACCTTGCGATGCGCCAACATCACCATCATTGAGCACCAGAAGTGTGAGAACGCCTACCCCGGC 
AACATCACAGACACCATGGTGTGTGCCAGCGTGCAGGAAGGGGGCAAGGACTCCTGCCAG 

<210> SEQ ID NO 1230 

<211> Length : 436 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1230 
>W60282_PEA_1_node_22

TCTCTTCAAGGCATTATCTCCTGGGGCCAGGATCCGTGTGCGATCACCCGAAAGCCTGGTGTCTACACGAAAGTCTG 
CAAATATGTGGACTGGATCCAGGAGACGATGAAGAACAATTAGACTGGACCCACCCACCACAGCCCATCACCCTCCA 
TTTCCACTTGGTGTTTGGTTCCTGTTCACTCTGTTAATAAGAAACCCTAAGCCAAGACCCTCTACGAACATTCTTTG 
GGCCTCCTGGACTACAGGAGATGCTGTCACTTAATAATCAACCTGGGGTTCGAAATCAGTGAGACCTGGATTCAAAT 
TCTGCCTTGAAATATTGTGACTCTGGGAATGACAACACCTGGTTTGTTCTCTGTTGTATCCCCAGCCCCAAAGACAG 
CTCCTGGCCATATATCAAGGTTTCAATAAATATTTGCTAAATGAGTGAATC 
<210> SEQ ID NO 1231

<211> Length : 669 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1231 
>W60282_PEA_1_node_5

GGAGCGGCCTAGGGGAGGCCAGGGGCCCACCTGGGCTGGGGCTGTGGAGAGGGAGTGGCTGGGACGGGAGGAAAAAG 
AGAGACGGAGATTAGATGGAAGAAGAGGGATTTCAAGACAAATTGCCAGAGATGCAGTCAGAGAGACTGACTGAGAG 
ACACAAAGATAGAAGGAATTAGAGAAAGGGCCACACAGAGCCAGACAGAGAGAGAAGAGTGGAGATGGAGACAGGGA 
CGAGGACAGAGAAAGGCAGACAGACACATAGGGACAGAAAGAGAAAAATCACACAAAGTCAGAATTACTGAATGACA 
GGGAATGACACATAGAACGAGACACAGATTCAGAGACTCAGGGCAGGGAAAGGAAGGCTGCAGACAGACAGACAGAC 
AGAGGGAGGCTGAGACACAGGGAGAAGAGGGGCTTGGAGAGGTGGCACAGGCAGGCAGCCAGTGCCTCAGAGGCCTC 
CGGGGAGGGCCCTCACACACACCCCGCCCCGGGGCATTAAGGCAGGGCTTGGAGGCCAGTCATCCTGGGCCCGCCCA 
GGGCCGCCCCCCTGCCAGCCCGCCTGCCTGGTGCCTGGCACCTGGCGCTCCAACCCAGCCTACCTGCTGTAGCTGCC 
GCCACTGCCGTCTCCGCCGCCACTGGGCCCCCAGAGCCCCAGCCCCAGAGCCT 

<210> SEQ ID NO 1232 

<211> Length : 33 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1232 
>W60282_PEA_1_node_21
GGTGACTCCGGGGGCCCTCTGGTCTGTAACCAG 

<210> SEQ ID NO 1233 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1233
>W60282_PEA_1_node_8

AGGAACCTGGGGCCCGCTCCTCCCCCCTCCAGGCCATGAGGATTCTGCAGTTAATCCTGCTTGCTCTGGCAACAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1234 

<211> Length : 616 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1234 
>Z41644_PEA_1_node_0

CCTTGCTGTTCATGGCCCAGCAGGGGCCCTATGGGGGTCAGGCCTGCAGGCACTCACACTTGGCACCTGCTCCAAAA 
CCCTTTCAGGTCTTTGAGGATCTGAGCCCTGGGCCTGGGTCTCCCGCCGGCTCGGAAAAGCTGGCCTGCCGGGCCAG 
ACGAGAGAACCACACGATTCAGAAAAGCAGTGCCCTTCAGCAGCCTCTCCACCGTCTGGGCTCCCCAAAGGCAGAGC 
GGGACGCTGGAAATGTGTGCGCGCTGTGGTATGGGTGTGCAAGTGTGCGAAGGCGGCGTGTTGTGTGAGCGAGAGGG 
TAGCGGATGTGTGTGTGCGTGTGCGCGCGTGGCTCCGGGTGTGCGCCGCTGCGATAGCGGGTCCTTTCCCGGGGCGG 
GCGACGGGCGGGCTGGGAAGGTCTCCTCCCCTCACCACATTGAGAAATCTCAGTGAGTCACCGAGTGGTTCTGCATA 
TTAATGAGCTCGCTCGCTGCGAGGGCAGGAGCGGATTTAAAAGAGGCCAGGGCGGGCGGAGGGAGGCTGTGGAGAGA 
GCGCGGAGACAAGCGCAGAGCGCAGCGCACGGCCACAGACAGCCCTGGGCATCCACCGACGGCGCAGCCGGAGCCAG 

<210> SEQ ID NO 1235 

<211> Length : 1,062

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1235 
>Z41644_PEA_1_node_11
GTACGCCCCGCCTCTACTCACCTTCCTTCCCACCAGACCCAGCTGTGGCTCTCAGGATGGGAAGGGACCCCCCCACC
AGGTCATCTAGCCCCATCTAATATGTGAACACCCACCACAACATCCACAGCAAACAGATACTCAGACAATGCTTACA 
TACCCCCAGGGACAAGGAACTCACCACTTACCAAAGCTAGCTATCCATCTCTTGTCCATTTGCAAGCATGGCAGGTT 
TGTCATTTTGTAAACTAAAGTCTGTCTCACTCTAATATTTGCATTATAATCTTAATTCCTCTTTTTATTTCAGTTAC 
GTAAGTTGTTAAATGGCAGAGTGAGCACTGGCATGGCTGCCAGGGGAGCTCTGAGGACTTCAGTGGGGTGAAATGTG 
ACCACTTAGGTGACTGTGTATGTTGGCTATAAAACTGCGCTATAAAACCATGAGGTGCTGAGGATGATCCTTGCCAG 
AAACATGTTTTCTTCTCCAAGGTGCCCCACTCCCTCTGCTGCCCAGAAACCTGATAAACTCCTTCCTTCGCAGGTGC 
TGGAAGGCACCACAGGTTTGGCTCTTTAAAATCAGAGCCACTGTTAACCAAGGCGGGCAGCAGTGTTAAGACCACCA 
GCACCCTGAACCAGCCCTGTACTTACTGGGCACTGTTTCCTTAAAATCAGAAGGTGGCTTCCCATCTCTGGTTTCCT 
GGGGTCTTATGTCTGTCCTCGGAGGGAGAATCCAGTTCCTAGCTCCCCTGTACCATGCGAAGGTAGCCTGTCCTGTC 
TCACTCCTCAGATACGCAGAGTCTGTTTACACATTTGCCTGCATAGCATGATCAGGAAGCACACACACACACACACA 
CACACACACACACGCATGCATGCACACACCATGCAGGTGACTTCCCCAGGAACTAGTGCCAGCACCCCTGCTGCAGA 
GGGGGATATCAAGGCTAAATGGAAGAGAGGGGTGACTTGCCTGGGAGCACAGGGCAAAGCCAGGACAGCAAACCAGG 
CCTCCTGGTGCTACCCCACCAGCTGCCCTCACAGGGTGGAAGGTACAGCCATAGTGGGTGC 



<210> SEQ ID NO 1236 

<211> Length : 261 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1236 
>Z41644_PEA_1_node_12

CTGCATTGCCCTCCCCTCACCTGGCCCAGCCATGCTACCCCAAGCTCAGCCCTGTGACCAGCTCTCCCAGAGCTGAC 
ACTCGGGCTCAACCCCTATACCTGAGCCTTTTTTGCTGCCTCCAAAACAGCCTCATCTGCAGTTGCTTGAAATAGAA 
AGTGATGAGAGCAATAAATTATTTTCTATAAATCTGCTGGGAATGAAGCCCTCTTTCTGGTCAAGCCAGGCAGCTCA 
TGTGGCAAAGGCCAGAACTGCGCAGTCCAC 



<210> SEQ ID NO 1237 
<211> Length : 1,361 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1237 
>Z41644_PEA_1_node_15

GCCCTGTGTCATCAAAACTGGCAGAAGGCTAATCCCATGGGCAGGTTATGGAAGAGGCTGAGGGCATCTTGATCTGA 

TTGCTGGGGGATACTCAAACCTTTAGCTCACCTTGCTTCTCCCCTCCACCTGAGCTGCAGCCTGGAAAAGGAGGCAC 

CCACAGGTCTAAACATGGCCCTGCTTTTTTTTTTCCTGAAAATTCCAATAACAAAAGCAACGAGAGCCTCTCACTAC 

CAGGCCTTCTCTCACTTTGCTATAAAATTAGTTCACCCCTCTTTCTTAGAGTGTTGAGGTCCCTGCCCTCCCCACCT 

CCCTCCCCTGAAACAAGTTGAAAATATCTTAATGAACATAGAACAGTGATAAAGGAAGTGTTTGAAGTCCTCTTTGT 

ACAGAGAGAGAGAGAAAGAGAATGCCAAAGCTAGGTTGGAGGAAGTAGAAGGGTATACGGTGGGCTCAGGCCCATGG 

GGGCCACACAGAGGAGCTCTGTGCACTTCAGAGACCAGAGCTTCCAGGGAGCTTCTGGCCACCACAGGAAGCAGCCT 

AGTCAGGCATTTTATTTCAATGGATAATTCAGTGGTCTTACTCAGAAATCAAGAACGAGACAGAAAAGTGATAGGCT 

AAGTGTAACGTATGGCCCCAGGGCAGCCATGGGGCAGAACTAGAAGAAAGCAAAATATCTAACTGGGCACAGCTTGA 

GAGGTGAGGGGAAGGTGGGGCTGGGAACGAGTAGAGATGAGGCAATGCAGCCAGGAGCAGGGACTGAGGGGCACAGG 

CCTCCTGCACCACTGCCCCACCCCACCAACCACCTCTTCTGTCTCCAGGAAGCAGCTTCTAGAGCTAGCATTCTTCT 

GGAGGACATGCATTATTTGGGCAAAATACAAAGAAATATACAAGCCTAAGTCAAGTAAGGGAATGCCTCCCACCCTT 

GCTATTTTCTCTAAATAGAGAGGCTGAGTACAGACGCGGAAAGAAACAAGGAGGTGTGGGAGCAGCCCGCCATGCTA 

GAGAAAGACTACATTCCTGCCACTAACAGTCGGTGGCCACTGGGCAAATCTTAAGTCTGTGGTGCCTCAGTTTCCTC 

ATATGCAAAGCGGGTTTGTTCCATAGGCCTCTGAGGACAAAATGAGATTGCAGAAGTGAGATTGCAGATGGTTAGAA 

AAGACAAAGCCACACTGGTGTGAGTTTTCATGGTCCCCGGGACCACATCCTCAGAAGGATCCCTCCCACTTCTCCTG 

GGGGTTCCTGCAGTTCTGGGACAGGGGCATTCCCTGCAGACCAGACGTGAATGAAGCCGCTTAGCCAGCATCTTGTG 

AACGGCCTGCCTCATGTCCTGAGCCACTTACACATGTGTTTTTTCTCCCCAG 



<210> SEQ ID NO 1238 

<211> Length : 569 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1238 
>Z41644_PEA_1_node_20

ATGGGAGACCCATCTCTCTTGTGCTCCAGACTTCATCACAGGCTGCTTTTTATCAAAAAGGGGAAAACTCATGCCTT 
TCCTTTTTAAAAAATGCTTTTTTGTATTTGTCCATACGTCACTATACATCTGAGCTTTATAAGCGCCCGGGAGGAAC 
AATGAGCTTGGTGGACACATTTCATTGCAGTGTTGCTCCATTCCTAGCTTGGGAAGCTTCCGCTTAGAGGTCCTGGC 
GCCTCGGCACAGCTGCCACGGGCTCTCCTGGGCTTATGGCCGGTCACAGCCTCAGTGTGACTCCACAGTGGCCCCTG
TAGCCGGGCAAGCAGGAGCAGGTCTCTCTGCATCTGTTCTCTGAGGAACTCAAGTTTGGTTGCCAGAAAAATGTGCT 
TCATTCCCCCCTGGTTAATTTTTACACACCCTAGGAAACATTTCCAAGATCCTGTGATGGCGAGACAAATGATCCTT 
AAAGAAGGTGTGGGGTCTTTCCCAACCTGAGGATTTCTGAAAGGTTCACAGGTTCAATATTTAATGCTTCAGAAGCA 

TGTGAGGTTCCCAACACTGTCAGCAAAAAC 

<210> SEQ ID NO 1239 

<211> Length : 163 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1239 
>Z41644_PEA_1_node_24

CAATATATTTGTGATTCCCCATGTAATTCTTCAATGTTAAACAGTGCAGTCCTCTTTCGAAAGCTAAGATGACCATG 
CGCCCTTTCCTCTGTACATATACCCTTAAGAACGCCCCCTCCACACACTGCCCCCCAGTATATGCCGCATTGTACTG 

CTGTGTTAT 

<210> SEQ ID NO 1240 

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1240 
>Z41644_PEA_1_node_1

CAGAGCCGGAAGGCGCGCCCCGGGCAGAGAAAGCCGAGCAGAGCTGGGTGGCGTCTCCGGGCCGCCGCTCCGACGGG 
CCAG 



<210> SEQ ID NO 1241 
<211> Length : 56
<212> Type : DNA

<213> Organism : Homo sapiens 



<400> sequence : 1241 
>Z41644_PEA_1_node_10

CTGCAGAGCACCAAGCGCTTCATCAAGTGGTACAACGCCTGGAACGAGAAGCGCAG 



<210> SEQ ID NO 1242 

<211> Length : 17 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1242 
>Z41644_PEA_1_node_13
ACTCTGTCACCCTCCAG 



<210> SEQ ID NO 1243 

<211> Length : 81 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1243 
>Z41644_PEA_1_node_16

GGTCTACGAAGAATAGGGTGAAAAACCTCAGAAGGGAAAACTCCAAACCAGTTGGGAGACTTGTGCAAAGGACTTTG 
CAGA 
<210> SEQ ID NO 1244

<211> Length : 20 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1244 
>Z41644_PEA_1_node_17
TTAAAAAAAAAAAAAAAAAA

<210> SEQ ID NO 1245 

<211> Length : 108 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1245 
>Z41644_PEA_1_node_19

AAGCCTTTCTTTCTCACAGGCATAAGACACAAATTATATATTGTTATGAAGCACTTTTTACCAACGGTCAGTTTTTA 
CATTTTATAGCTGCGTGCGAAAGGCTTCCAG 

<210> SEQ ID NO 1246 

<211> Length : 40 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1246 
>Z41644_PEA_1_node_2

CGCCCTCCCCATGTCCCTGCTCCCACGCCGCGCCCCTCCG 
<210> SEQ ID NO 1247

<211> Length : 23 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1247 
>Z41644_PEA_1_node_21
CTTAGGAGAAAACTTAAAAATAT

<210> SEQ ID NO 1248 

<211> Length : 53 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1248 

>Z41644_PEA_1_node_22

ATGAATACATGCGCAATACACAGCTACAGACACACATTCTGTTGACAAGGGAA 

<210> SEQ ID NO 1249 

<211> Length : 54 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1249 

>Z41644_PEA_1_node_23

AACCTTCAAAGCATGTTTCTTTCCCTCACCACAACAGAACATGCAGTACTAAAG 
<210> SEQ ID NO 1250

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1250 
>Z41644_PEA_1_node_25

ATGCTATGTACATGTCAGAAACCATTAGCATTGCATGCAGGTTTCATATTCTTTCTAAGATGGAAAGTAATAAAATA 
TATTTGAAATGTACCAAAATTCTAGA 

<210> SEQ ID NO 1251 

<211> Length : 36 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1251 
>Z41644_PEA_1_node_3

GTCAGCATGAGGCTCCTGGCGGCCGCGCTGCTCCTG 

<210> SEQ ID NO 1252 

<211> Length : 34 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1252 
>Z41644_PEA_1_node_4
CTGCTGCTGGCGCTGTACACCGCGCGTGTGGACG

<210> SEQ ID NO 1253 

<211> Length : 106 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1253 
>Z41644_PEA_1_node_6

GGTCCAAATGCAAGTGCTCCCGGAAGGGACCCAAGATCCGCTACAGCGACGTGAAGAAGCTGGAAATGAAGCCAAAG 
TACCCGCACTGCGAGGAGAAGATGGTTAT 

<210> SEQ ID NO 1254 

<211> Length : 58 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1254 
>Z41644_PEA_1_node_9

CATCACCACCAAGAGCGTGTCCAGGTACCGAGGTCAGGAGCACTGCCTGCACCCCAAG 



Segment nucleic acid sequences: 

<210> SEQ ID NO 1255 

<211> Length : 669 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1255
>Z44808_PEA_1_node_0

CCTGGACCCTGGGGCGTGAGGAGGGCGCGGTGCGTCCCGTGGTTGTGCTTGGAAGCCCCCCGAGGGTGCGCGCGCGT 
GGGTATGAGTGCGTGCGTGTGCCTGGGTGTGCGTGTGTGTAAGTGTGCACGTGTGTGTGTGAGAGTGCGCGCGGGGA 
AGGAGGCACAGAGACAGCCCGGACAGGCCACTGCGCAGCCCTGGTGGCCCCCGCTCCACCTCTCGCTCCGCAGACCC 
GCGCCAGGGAGGCCTCTGGGCCGCAGCGGGCACCGGAGCGGAGCGGGCGCGGCAGCGGGCGCTGGGAGGTGGGGCTG 
GGGGAGGAGAGGGGGAGGGAGAGAGGCGGGCGGGAGGGGAGGATCCGGGAAGCTCCGGGGTATTTGACAGGAGCGAG 
GGCGGACGCAAAGAACGCGGAGGACCTCTGGGTGCCTGCAGGGGAGCTGCTCCAGCCGGGCCGCCGGGAGCGGTGGG 
GAGAGCATCGCGGAGCCGCCCCTCCACGCGCCCGCCCAGCCGCGCTCGCCCACTGGGCTCTCCCGGCTGCAGTGCCA 
GGGCGCAGGACGCGGCCGATCTCCCGCTCCCGCCACCTCCGCCACCATGCTGCTCCCCCAGCTCTGCTGGCTGCCGC 
TGCTCGCTGGGCTGCTCCCGCCGGTGCCCGCTCAGAAGTTCTCGGCGCTCACG 



<210> SEQ ID NO 1256 

<211> Length : 187 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1256 
>Z44808_PEA_1_node_16

TGTCATCCTGTGACCAAGAGCACCAGTCTGCCCTGGAGGAAGCCAAGCAGCCCAAGAACGACAATGTGGTGATCCCT 
GAGTGTGCGCACGGCGGCCTCTACAAGCCAGTGCAGTGCCACCCCTCCACGGGGTACTGCTGGTGCGTCCTGGTGGA 
CACGGGGCGCCCCATTCCCGGCACATCCACAAG 

<210> SEQ ID NO 1257 

<211> Length : 172 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1257
>Z44808_PEA_1_node_2

TTTTTGAGAGTGGATCAAGATAAAGACAAGGATTGTAGCTTGGACTGTGCGGGTTCGCCCCAGAAACCTCTCTGCGC 
ATCTGACGGAAGGACCTTCCTTTCCCGTTGTGAATTTCAACGTGCCAAGTGCAAAGATCCCCAGCTAGAGATTGCAT 

ATCGAGGAAACTGCAAAG 



<210> SEQ ID NO 1258 

<211> Length : 275 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1258 
>Z44808_PEA_1_node_24

GCTCTCAGAACCCGACCCCAGCCATACCCTAGAGGAGCGGGTGGTGCACTGGTACTTCAAACTACTGGATAAAAACT 
CCAGTGGAGACATCGGCAAAAAGGAAATCAAACCCTTCAAGAGGTTCCTTCGCAAAAAATCAAAGCCCAAAAAATGT 
GTGAAGAAGTTTGTTGAATACTGTGACGTGAATAATGACAAATCCATCTCCGTACAAGAACTGATGGGCTGCCTGGG 
CGTGGCGAAAGAGGACGGCAAAGCGGACACCAAGAAACGCCACA 

<210> SEQ ID NO 1259 

<211> Length : 1685

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1259 
>Z44808_PEA_1_node_32

GATGCAATGGTGGTGTCCTCCAGACCCAAAGCCACAACCCATCGCAAGTCAAGAACACTTTCCAGAAGATAAACATG 
AGTGGGTTCATGTCTCTCTCCTTCAAAGCCAGGACAAAATCCCCACTTCTTTGCTGCCGCGAGTCAATTTGTGATTT 
ATTTTGTCTGCACCTGTTTGATGCCAGGTCGACATTTCCTAAGGCAAGCCCCTGTATTTGTTGTGGATTTAAGTGGA 
GGCGGCCAGCACACACCTTGGATGTAATTTAAAACCATTTCCTGAGGAAAGATGTGTGATATGCTTTCCTTTGTTTA 
GCAAATGTTTATGGTTTTAACTTTAAATCTCACCGCAAATCACTTACACTTGAAAACAGGGCTGGTCTGAAAGTAAT 
TACCCTCCCTGAGTGCCAAGACCTCCAGAAGTTGTTTTCATTCCCGAATGGCAATCACTGTACTCATGCGCTCCACG

CATCTTAAATAAACTCAGTTCAAAGCACATGCCTCCTGCTTCAGCTCTTTTTTCCAAAAAGAGAAACAGAAGCAGGT 

TCCCCCTCCTTTTATAGTGCCTGCGTGGACACGCGGACCTCCATGCCTTTCATGCTGTGGCTATGTCAGCAAACTAC 

GATATTGGGATGATCCTAACGGGCAAGCCAGCTGCGGCTCCTACCGGCCGTGGCCATTGAAGGCCACCATGTTGCTT 

TGAAACATCTCAAAGAATAACATAGTGCCAGCCAGCAAGGGTTTCACCATATGCATGACCCAGACAGGAACTATCAA 

AAGAAGGGATCACGGGAAGGTGCATGATGCTAATGTGGAATCCAGAGGAGCTCTTTCCTGATCTCTTCAGCTTCCGC 

TGCCACTCCAGAATCATCAGAGCTGATATTAAATAAGTTAAAATGTTAGTCCACCGTCTCCTCCTGCAATCCTAACC 

ATCTTTTGAGACTGTTAGAATACTTTGACGGGTTGTCTTTCTGTGCAACTAATTTAAACCTCAAGTTTAGTGTAGGA 

GATGGGTTTGTCTXCTCACCTCTTCAGATCTTTATCAAGGGGGAATAAAAGCCAACCCAGAAACCTAAACTTTAAAA 

TTTAATTATTTGAAATAATAAAACAGAAGAAGGGATCAACATTTGTCGGAATTGGCACTCTTGGAAAACTAAGTCTA 

GGAGATCATATATTGCTTTTTTTTTTTCATTCTAAATTACTTTTAATTGAAAGTCAAGATGCTGAGTTACAGTTGTT 

TATCATTATAATAAGCAAACTTTTTAAGTTGGATTTCTTCTTAAAGAGGTAAACTAGTGAACAAAAAAGATAAAAAG 

GAAAATTAAGAATCAACTATGCCTTTATCAAATTTGAAGCATAAGTTATATTATTAAAATTATTTTTGTATAATCAA 

GGTGATAAGACATTCTGGAAAACATTTAATGTATTTAGTACTTAGAATATTTACAGTGGATGTTACTTTTTTGAAAC 

GATATATTTTTCCCAATTTTTCTATCATGTCAAGGAAGGAAACTGTTAAGAAGTTACCAGTGTCCAAAATGTCTTCA 

TTGTTTCTTACTCATACTTACACCTCACATGACCTGCCCAGCCCTCTTTGGTTCAGTTCATTCCCAGAAGCCAAGCC 

TTAGTCTTCACAGATGAGCGACACACACCTCTGAATATAATGTCTCTTTTTTGTTTTTTCCTTTTCAG 



<210> SEQ ID NO 1260 

<211> Length : 877 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1260 
>Z44808_PEA_1_node_33

CCAAGGAAACAAGGATAAATGGCTCATACCCCGAAGGCAGTTCCTAGACACATGGGAAATTTCCCTCACCAAAGAGC 
AATTAAGAAAACAAAAACAGAAACACATAGTATTTGCACTTTGTACTTTAAATGTAAATTCACTTTGTAGAAATGAG 
CTATTTAAACAGACTGTTTTAATCTGTGAAAATGGAGAGCTGGCTTCAGAAAATTAATCACATACAATGTATGTGTC 
CTCTTTTGACCTTGGAAATCTGTATGTGGTGGAGAAGTATTTGAATGCATTTAGGCTTAATTTCTTCGCCTTCCATA 
TGTTAACAGTAGAGCTCTATGCACTCCGGCTGCAATCGTATGGCTTTCTCTAACCCCTGCAGTCACTTCCAGATGCC 
TGTGCTTACAGCATTGTGGAATCATGTTGGAAGCTCCACATGTCCATGGAAGTTTGTGATGTACGGCCGACCCTACA 
GGCAGTTAACATGCATGGGCTGGTTTGTTTCTTGGGATTTTCTGTTAGTTTGTCTTGTTTTGCTTTCCAGAGATCTT 
GCTCATACAATGAATCACGCAACCACTAAAGCTATCCAGTTAAGTGCAGGTAGTTCCCCTGGAGGAAATAATATTTT 
CAAACTGTCGTTGGTGTGATACTTTGGCTCAAAGGATCTTTGCTTTTCCATTTTAAGCTTCTGTTTTGAGTTTTGCC 
CTGGGGCTTGAATGAGTCCCAGAGAGTCGTTCGGATGGTGGGAGGCTGCCTAGGAGGCAGTAAATCCAGTCACAGTG
CCTGGGAGGGGCCCATCCTTCCAAAATGTAAATCCAGTCGCGGTGTGACCGAGCTGGCTAACAGGCTTGTCTGCCTG 
GTTTTCCTCCTACACGTGGACATTATTCTC 

<210> SEQ ID NO 1261 

<211> Length : 252 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1261 
>Z44808_PEA_1_node_36

GATACGTGGTGCCCCCGGGGCTGGTGTTGGCAGCCGGGGGGAGGTGCCTGAGGGTCCCCACGGTTCCTTTCTGCTTT 
TCTGAATGCATCAAGGGTACGAGAACTTGCCAATGGGAAATTCATCCGAGTGGCACTGGCAGAGAAGGATAGGAGTG 
GAATGCCCACACAGTGACCAACAGAACTGGTCTGCGTGCATAACCAGCTGCCACCCTCAGGCCTGGGCCCCAGAGCT 
CAGGGCACCCAGTGTCTTAAG 

<210> SEQ ID NO 1262 

<211> Length : 349 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1262 
>Z44808_PEA_1_node_37

GAACCATTTGGAGGACAGTCTGAGAGCAGGAACTTCAAGCTGTGATTCTATCTCGGCTCAGACTTTTGGTTGGAAAA 
AGATCTTCATGGCCCCAAATCCCCTGAGACATGCCTTGTAGAATGATTTTGTGATGTTGTGATGCTTGTGGAGCATC 
GCGTAAGGCTTCTTGCTTATTTAAACTGTGCAAGGTAAAAATCAAGCCTTTGGAGCCACAGAACCAGCTCAAGTACA 
TGCCAATGTTGTTTAAGAAACAGTTATGATCCTAAACTTTTTGGATAATCTTTTATATTTCTGACCTTTGAATTTAA 
TCATTGTTCTTAGATTAAAATAAAATATGCTATTGAAACTA 
<210> SEQ ID NO 1263

<211> Length : 233 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1263 
>Z44808_PEA_1_node_41

CACCTCATAATCATGTGAAAAAGACACTCAAAAACTACCATTTGAATGGATGGATGAAAATAACCTCCGTATATTCT 
ACGAAGATGTTTAATAATAAATAGGTTTCGTTATAAGAGAATGTGTGTCACTTCGTCTCTTCCCTCACCCCCGAGAC 
TTAGTGACAGTTATTTTTGACTTTTCCAACTATACTATTTGCCTAGAAAATGTGTCTATTAAATAGCGTATTGAGAA 

AT 

<210> SEQ ID NO 1264 

<211> Length : 51 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1264 
>Z44808_PEA_1_node_11

ATGATGCCGCAGCTCCAGCGTTGGAGACTCAGCCTCAAGGAGATGAAGAAG 

<210> SEQ ID NO 1265 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1265 
>Z44808_PEA_1_node_13

ATATTGCATCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAAAACCAATAAGAATTCAG 

<210> SEQ ID NO 1266 

<211> Length : 83 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1266 
>Z44808_PEA_1_node_18

GTACGAGCAGCCGAAATGTGACAACACGGCCAGGGCCCACCCAGCCAAAGCCCGGGACCTGTACAAGGGCCGCCAGC 
TACAAG 

<210> SEQ ID NO 1267 

<211> Length : 103 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1267 
>Z44808_PEA_1_node_22

GTTGTCCGGGTGCCAAAAAGCATGAGTTTCTGACCAGCGTTCTGGACGCGCTGTCCACGGACATGGTCCACGCCGCC 
TCCGACCCCTCCTCCTCGTCAGGCAG 

<210> SEQ ID NO 1268 

<211> Length : 95 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1268
>Z44808_PEA_1_node_26

GAAGTAAGAGAAACCTGTGATGGCCAGAGCCCAGATGTTCTTAGGAGGCAAGCCAGGAGAAGCCGGGTCTGACTTTT 
CAGCTCAGAGACAGCACT 



<210> SEQ ID NO 1269 

<211> Length : 38 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1269 
>Z44808_PEA_1_node_30

CCCCCAGAGGTCATGCTGAAAGTACGTCTAATAGACAG 



<210> SEQ ID NO 1270 

<211> Length : 75 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1270 
>Z44808_PEA_1_node_34

CTGATCCTCCTACCTGGTCCACCCCAGGGCTACCGGAAGGTAAAATCTTCACCTGAACCAATTATGAGCAGTCTC 

<210> SEQ ID NO 1271 

<211> Length : 19 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1271
>Z44808_PEA_1_node_35
CTTACTGAAGGTACAGCCG 

<210> SEQ ID NO 1272 

<211> Length : 65 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1272 
>Z44808_PEA_1_node_39

CTACTGTGGTTGAGAGGAAAGGTGTCTTTTTATTGCTTCTAGAGACGTTGAAAGTGTGACCTGAG 

<210> SEQ ID NO 1273 

<211> Length : 107 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1273 
>Z44808_PEA_1_node_4

ACGTGTCCAGGTGTGTGGCCGAAAGGAAGTATACCCAGGAGCAAGCCCGGAAGGAGTTTCAGCAAGTGTTCATTCCT 
GAGTGCAATGACGACGGCACCTACAGTCAG 

<210> SEQ ID NO 1274 

<211> Length : 100 

<212> Type : DNA 

<213> Organism : Homo sapiens 



<400> sequence : 1274 
>Z44808_PEA_1_node_6

GTCCAGTGTCACAGCTACACGGGATACTGCTGGTGCGTCACGCCCAACGGGAGGCCCATCAGCGGCACTGCCGTGGC 
CCACAAGACGCCCCGGTGCCCGG 

<210> SEQ ID NO 1275 

<211> Length : 48

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1275 
>Z44808_PEA_1_node_8

GTTCCGTAAATGAAAAGTTACCCCAACGCGAAGGCACAGGAAAAACAG 

Segment nucleic acid sequences: 

<210> SEQ ID NO 1276 

<211> Length : 126 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1276 
>HSU33147_PEA_1_node_0

GTGCTCACCTCCACAGCGGCTTCCTTGATCCTTGCCACCCGCGACTGAACACCGACAGCAGCAGCCTCACCATGAAG 
TTGCTGATGGTCCTCATGCTGGCGGCCCTCTCCCAGCACTGCTACGCAG 

<210> SEQ ID NO 1277 
<211> Length : 179 
<212> Type : DNA 
<213> Organism : Homo sapiens

<400> sequence : 1277 
>HSU33147_PEA_1_node_2

GCTCTGGCTGCCCCTTATTGGAGAATGTGATTTCCAAGACAATCAATCCACAAGTGTCTAAGACTGAATACAAAGAA 
CTTCTTCAAGAGTTCATAGACGACAATGCCACTACAAATGCCATAGATGAATTGAAGGAATGTTTTCTTAACCAAAC 

GGATGAAACTCTGAGCAATGTTGAG 

<210> SEQ ID NO 1278 

<211> Length : 593 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1278 
>HSU33147_PEA_1_node_4

GTAATTTCATTTTCTTCCTATAAGCTTTTTAAATCCCCTGACCAGGGACAAGTGGGCTCTTCATTTCTCACTGACAA 
TGCCAAAGCCACTAGTGAACAAGCCTTTTCTTACATTGGTTAATTTAGTTGAATGGTTAGTCTAATGACTTTGCCAT 
CAAGAAAAACATCCAGTGTCCCTGTGTTGTCACTCTACCCAGAGAATCCTCAGTGGATGATAAATGAATAGGGCAAG 
AGAGGAAAAGGAAAGGTCGGTAGAAGTCTTACCTATCCCCAGAGCTCTCTAATTCATGCTCACAAACACAGACACAA 
TCACACAAACACAGAAACACACATACACACATCCAGACACATGCAAACACACAGACACAGTCACAATCACACAAACA 
CACACACATTCAGACATACACAAACATAGACAGACAGGCAAAGACACAGACACAGACACAGACACAATCACACCAGC 
ACACAATCATCCAGACACAAACACAAACACACAGACAGAACCACACAACCACAGAAACACAGAGACACACACAAACA 
CACTCAGACACACACATACAAACATATGTTCACTCTCTACAGAAAAAACAATTT 

<210> SEQ ID NO 1279 

<211> Length : 211 

<212> Type : DNA 

<213> Organism : Homo sapiens 
<400> sequence : 1279
>HSU33147_PEA_1_node_7

CAATTAATATATGACAGCAGTCTTTGTGATTTATTTTAACTTTCTGCAAGACCTTTGGCTCACAGAACTGCAGGGTA 
TGGTGAGAAACCAACTACGGATTGCTGCAAACCACACCTTCTCTTTCTTATGTCTTTTTACTACAAACTACAAGACA 
ATTGTTGAAACCTGCTATACATGTTTATTTTAATAAATTGATGGCAAAAAAAAAAAT 

<210> SEQ ID NO 1280 

<211> Length : 9 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1280 

>HSU33147_PEA_1_node_3
GTGTTTATG 
<210> SEQ ID NO 1281

<211> Length : 152 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1281 
>H61775_P16 

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWLRFGFLLPIFIQFGLYSPRI 
DPDYVGDCGFPAFRELKRAETVSPVFFTRRCIWEDLKSTGFSPAGGGRPPGGGPRTQEDSGLPCWRSSCSVTLQV 

<210> SEQ ID NO 1282 

<211> Length : 83 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1282 
>H61775_P17 

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWLRFGFLLPIFIQFGLYSPRI
DPDYVG 
<210> SEQ ID NO 1283

<211> Length : 496 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1283 
>M85491_PEA_1_P13

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIRTYQVCNVFESSQNNWLRTKF 
IRRRGAHRIHVEMKFSVRDCSSIPSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQVDLGG 
RVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAE
EVDVPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPINSRTTSEGATNCVCRNG 
YYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDSGGREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQ 
LGLTEPRIYISDLLAHTQYTFEIQAVNGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVDSITLSWSQPDQP 
NGVILDYELQYYEKVPIGWVLSPSPTSLRAPLPG 

<210> SEQ ID NO 1284 

<211> Length : 301 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1284 
>M85491_PEA_1_P14

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIRTYQVCNVFESSQNNWLRTKF 
IRRRGAHRIHVEMKFSVRDCSSIPSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQVDLGG 
RVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAE 
EVDVPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRERQDLTMLSRLVLNSWPQMILPPQPPKVLEL 

<210> SEQ ID NO 1285 

<211> Length : 283 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1285 
>T39971_P6 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAECKPQVTRGDVFTMPEDEYTV 
YDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEE 
LCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFE 
DGVLDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGTQGVVGD

<210> SEQ ID NO 1286 

<211> Length : 447 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1286 

>T39971_P9 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAECKPQVTRGDVFTMPEDEYTV 
YDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEE 
LCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFE 
DGVLDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSAVFEHFAMM 
QRDSWEDIFELLFWGRTSGMAPRPSLAKKQRFRHRNRKGYRSQRGHSRGRNQNSRRPSRATWLSLFSSEESNLGANN 
YDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL 

<210> SEQ ID NO 1287 

<211> Length : 363 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1287 
>T39971_P11 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAECKPQVTRGDVFTMPEDEYTV 
YDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEE 
LCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFE 
DGVLDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSAVFEHFAMM 
QRDSWEDIFELLFWGRTSDKYYRVNLRTRRVDTVDPPYPRSIAQYWLGCPAPGHL 

<210> SEQ ID NO 1288 

<211> Length : 238 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1288 
>T39971_P12 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAECKPQVTRGDVFTMPEDEYTV 




YDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEE 
LCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKVPGAVGQG 

RKHLGRV 

<210> SEQ ID NO 1289 

<211> Length : 790 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1289 
>Z21368_PEA_1_P2

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF 
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSA
PQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELEN 
TYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSV
LKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQ 
CIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVH 
TRQTRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTH 
KCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQE 
KLKSHLHPFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNPHKYSAHGR 

TRHFESATRTTNGAQKLSRI 

<210> SEQ ID NO 1290 

<211> Length : 791 

<212> Type : PRT 

<213> Organism : Homo sapiens

<400> sequence : 1290 
>Z21368_PEA_1_P5

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELAFFGKYLNEYNGSYIPPGWR 
EWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQF 
SKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYI 
IYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKL 
LDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIE 
DTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQ 
TRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTHKCF 
ILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLK 
SHLHPFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNN
NTYWCLRTVNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDV 
GNKDGGSYDLHRGQLWDGWEG 

<210> SEQ ID NO 1291 

<211> Length : 416 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1291 
>Z21368_PEA_1_P15

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF 
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSA 
PQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELEN 
TYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSV 
LKLLDPEKPGNRFRTNKKAKIWRDTFLVERG 

<210> SEQ ID NO 1292 

<211> Length : 410 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1292 
>Z21368_PEA_1_P16

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSA 
PQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELEN 
TYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSV 

LKLLDPEKPGNRCVIVPPLSQPQIH 

<210> SEQ ID NO 1293 

<211> Length : 210 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1293 
>Z21368_PEA_1_P22

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF 
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKARYDGDQPRCAPRPRGLSPTVF 

<210> SEQ ID NO 1294 

<211> Length : 145 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1294 
>Z21368_PEA_1_P23

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF 
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTGLLHRLNH 

<210> SEQ ID NO 1295 

SEQ ID NO: 1295 
HPRT1 Forward primer 
TGACACTGGCAAAACAATGCA

SEQ ID NO: 1296 
HPRT1 Reverse primer 
GGTCCTTTTCACCAGCAAGCT 

SEQ ID NO: 1297 
HPRT1-amplicon

TGACACTGGCAAAACAATGCAGACTTTGCTTTCCTTGGTCAGGCAGTATAATCCAAAGATGGTCAAGGTCGCAAGCT 
TGCTGGTGAAAAGGACC 

SEQ ID NO: 1298 

RPL19 Forward primer 

TGGCAAGAAGAAGGTCTGGTTAG 

<210> SEQ ID NO 1299 

<211> Length : 141 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1299
>HUMGRP5E_P4 

MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTGESSSVSERGSLKQQLREYIRW 
EEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSEDSSNFKDVGSKGKGSQREGRNPQLNQQ 

<210> SEQ ID NO 1300 

<211> Length : 142 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1300 
>HUMGRP5E_P5 

MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTGESSSVSERGSLKQQLREYIRW 
EEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSEDSSNFKDVGSKGKDSLLQVLNVKEGTPS 

<210> SEQ ID NO 1301 

<211> Length : 201

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1301 
>D56406_PEA_1_P2

MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGE 
VHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWEARWLTPVIPALWEAETGGSRGQEMETIPANTLIQ 
EDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYYY 

<210> SEQ ID NO 1302 

<211> Length : 168 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1302 
>D56406_PEA_1_P5

MMAGMKIQLVCMLLLAFSSWSLCSEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVH 
EEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENK 

PRRPYILKRDSYYY 
<210> SEQ ID NO 1303

<211> Length : 95 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1303 
>D56406_PEA_1_P6

MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKLIQEDILDTGNDKNGKEEVIKRKIPYILKRQL 
YENKPRRPYILKRDSYYY 

Variant protein amino acid sequences: 

<210> SEQ ID NO 1304 

<211> Length : 33

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1304 
>F05068_PEA_1_P7

MKLVSVALMYLGSLAFLGADTARLDVASEFRKK 

<210> SEQ ID NO 1305 

<211> Length : 83 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1305 
>F05068_PEA_1_P8

MKLVSVALMYLGSLAFLGADTARLDVASEFRKKWNKWALSRGKRELRMSSSYPTGLADVKAGPAQTLIRPQDMKGAS 
RSPEDR 

<210> SEQ ID NO 1306 

<211> Length : 180 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1306 
>H14624_P15
MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLCHGIEYQNMRLPNLLGHETMKEVLEQAG
AWIPLVMKQCHPDTKKFLCSLFAPVCLDDLDETIQPCHSLCVQVKDRCAPVMSAFGFPWPDMLECDRFPQDNDLCIP 
LASSDHLLPATEEGKPSLLLPHSLLG 

<210> SEQ ID NO 1307 

<211> Length : 381 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1307 
>H38804_PEA_1_P5

MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQMTGSNEFKLNQPPEDGISSV 
KFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDA 
PIRCVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQQRR 
ESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFAT
GGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYEMDDTEHPEDGIFIRQVTDAETKPK 

<210> SEQ ID NO 1308 

<211> Length : 385 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1308 
>H38804_PEA_1_P17

MGRVRTLAGECSAQAQAQSLLAVVLSAPPSGGTPSARLSVRSPSPRDPWGLWAPVLQMTGSNEFKLNQPPEDGISSV 
KFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGAVLDCAFYDPTHAWSGGLDHQLKMHDLNTDQENLVGTHDA 
PIRCVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGTAGRRVLVWDLRNMGYVQQRR 
ESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENNIEQIYPVNAISFHNIHNTFAT
GGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYEMDDTEHPEDGIFIRQVTDAETKPKSPCT

<210> SEQ ID NO 1309 

<211> Length : 81 

<212> Type : PRT

<213> Organism : Homo sapiens 

<400> sequence : 1309 
>HSENA78_P2

MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCVCLQTTQGVHPKMISNLQVFAIGPQCSK 
VEVV

<210> SEQ ID NO 1310 

<211> Length : 340 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1310 
>HUMODCA_P9 

MKSLTATSSMKVLLPRTFWTRKLMKFLLLLVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHV 
GSGCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSGVRIIAEP 
GRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYS 
SSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEV
EEQDASTLPVSCAWESGMKRHRAACASASINV 

<210> SEQ ID NO 1311 

<211> Length : 283 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1311 
>R00299_P3

MAEKALLCPSSAGLGTWPWVLNSAWPVLPLAVDQGVDWRPRGPVSSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDL 
ELNPIRSKIVRAFFDNRNLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRI 
TLEEYRNVVEELLSGNPHIEKESARSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIETKMHVRFLNM 

ETMALCH 

<210> SEQ ID NO 1312 

<211> Length : 80 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1312 
>W60282_PEA_1_P14

MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTAAHCLKPTPASHLAMRQH 
HHH 



<210> SEQ ID NO 1313 
<211> Length : 123

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1313 
>Z41644_PEA_1_P10

MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPHCEEKMVIITTKSVSRYRGQEHCLHPKL 
QSTKRFIKWYNAWNEKRRYAPPLLTFLPTRPSCGSQDGKGPPHQVI 

<210> SEQ ID NO 1314 

<211> Length : 464 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1314 
>Z44808_PEA_1_P5

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGRTFLSRCEFQRAKCKDPQLE 
IAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCP 
GSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPK 
NDNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQGCPGAK 
KHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKP 
KKCVKKFVEYCDVNNDKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQDAMVVSSRPKATTHRKSRTLS 

RR 

<210> SEQ ID NO 1315 

<211> Length : 434 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1315 
>Z44808_PEA_1_P6

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGRTFLSRCEFQRAKCKDPQLE 
IAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCP 
GSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPK 
NDNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQGCPGAK 
KHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKP 
KKCVKKFVEYCDVNNDKSISVQELMGCLGVAKEDGKADTKKRHRSKRNL 
<210> SEQ ID NO 1316

<211> Length : 454 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1316 
>Z44808_PEA_1_P7

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGRTFLSRCEFQRAKCKDPQLE 
IAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCP 
GSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPK 
NDNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQGCPGAK 
KHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKP
KKCVKKFVEYCDVNNDKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQLLWLRGKVSFYCF 

<210> SEQ ID NO 1317 
<211> Length : 429 
<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1317 
>Z44808_PEA_1_P11

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGRTFLSRCEFQRAKCKDPQLE 
IAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCP 
GSVNEKLPQREGTGKTDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPKNDNVVIPECAHGGLYKP
VQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQGCPGAKKHEFLTSVLDALSTDMV 
HAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKPKKCVKKFVEYCDVNNDK 
SISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQPRKQG

<210> SEQ ID NO 1318 

<211> Length : 314 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1318 
>AA161187_P1 

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWA 
LTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQ 
PICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGGK 
DACFGDSGGPLACNKNGLWYQIGVVSWGVGCGRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWAL
PLLGPV 

<210> SEQ ID NO 1319 

<211> Length : 326 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1319 
>AA161187_P6 

HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIRGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSH 
VCGVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVK
LSAPVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFG
DMVCAGNAQGGKDACFGDSGGPLACNKNGLWYQIGVVSWGVGCGRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPS 

WPLLFFPLLWALPLLGPV 

<210> SEQ ID NO 1320 

<211> Length : 213 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1320 
>AA161187_P13 

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWA 
LTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQ 
PICLQASTFEFENRTDCWVTGWGYIKEDEGSSGRHHKQLYVQPPLPQVQFPQGHLWRHG 

<210> SEQ ID NO 1321 

<211> Length : 307 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1321 
>AA161187_P14 

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWA 
LTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQ 
PICLQASTFEFENRTDCWVTGWGYIKEDEGCCLSPSHYRPHSTAISPHPPGSSGRHHKQLYVQPPLPQVQFPQGHLW 
RHGLCWQCPRREGCLLRECPCHHSQPRKASCVPVPYLTLMPTPGGGDCCPTLQMQKRRLGCCQGEEEDVHPVYPAP 
<210> SEQ ID NO 1322

<211> Length : 265 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1322 
>AA161187_P18 

HTREGTLGGQKRAFPDGVEGEKGRGRAWGAASRGSAVPLTIRGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSH 
VCGVSLLSHRWALTAAHCFETDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLS 
APVTYTKHIQPICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDM 
VCAGNAQGGKDACFVSVPATTPSPGKHPVSLCLI 

<210> SEQ ID NO 1323 

<211> Length : 188 

<212> Type : PRT 

<213> Organism : Homo sapiens

<400> sequence : 1323 
>AA161187_P19 

MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWA 
LTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQ 
PICLQASTFEFENRTDCWVTGWGYIKEDEDKRTQ 

<210> SEQ ID NO 1324 

<211> Length : 354 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1324 
>R66178_P3 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANPLPSVKITQVTWQKSTNGSKQN 
VAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQ 
AVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYH 
MDRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTLFFKGPINY 
SLAGTYICEATNPIGTRSGQVEVNITGEGHSLPISPGVLQTQNCGP 



<210> SEQ ID NO 1325 
<211> Length : 352

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1325 
>R66178_P4

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANPLPSVKITQVTWQKSTNGSKQN 
VAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQ 
AVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYH 
MDRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTLFFKGPINY 
SLAGTYICEATNPIGTRSGQVEVNITAFCQLIYPGKGRTRARMF 

<210> SEQ ID NO 1326 

<211> Length : 363 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1326 
>R66178_P8 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANPLPSVKITQVTWQKSTNGSKQN 
VAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQ 
AVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYH 
MDRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTLFFKGPINY 
SLAGTYICEATNPIGTRSGQVENSPTPRLLPNMGGAPGRCPRPSLGAWRGASCWC 

<210> SEQ ID NO 1327 

<211> Length : 398 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1327 
>HUMPHOSLIP_PEA_2_P10 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEKVYDFLSTFI 
TSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMKDPVASTSNLDMDFRGAFFPLTERNWSLP 
NRAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLDMLLRATYFGSIVLLSPAVIDSPLKLE 
LRVLAPPRCTIKPSGTTISVTASVTIALVPPDQPEVQLSSMTMDARLSAKMALRGKALRTQLDLRRFRIYSNHSALE 
SLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVTNHAGFLTIGADLHFAKGLREVIEKNRPADV 
RASTAPTPSTAAV 
<210> SEQ ID NO 1328 

<211> Length : 432 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1328 
>HUMPHOSLIP_PEA_2_P12 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEVKVTELQLTS 
SELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMH 
AAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMKDPVASTSNLD 
MDFRGAFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLDMLLRATYF 
GSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASVTIALVPPDQPEVQLSSMTMDARLSAKMALRGKALR 
TQLDLRRFRIYSNHSALESLALIPLQAPLKTMLQIGVMPMLNGKAGV 

<210> SEQ ID NO 1329 

<211> Length : 52 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1329 
>HUMPHOSLIP_PEA_2_P30 

MALFGALFLALLAGAHAEFPGRGCAFWSKSWRLSPFRTCGAKKATSTTTSLR 

<210> SEQ ID NO 1330 

<211> Length : 98 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1330 
>HUMPHOSLIP_PEA_2_P31 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEPGLERGADKF 
PVVGGSSLFLALDLTLRPPVG 

<210> SEQ ID NO 1331 

<211> Length : 200 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1331 
>HUMPHOSLIP_PEA_2_P33 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEVKVTELQLTS 
SELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMH 
AAFGGTFKKVYDFLSTFITSGMRFLLNQQVWAATGRRVARVGMLSL 

<210> SEQ ID NO 1332 

<211> Length : 217 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1332 
>HUMPHOSLIP_PEA_2_P34 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEVKVTELQLTS 
SELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMH 
AAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVLWTSLLALTIPS 

<210> SEQ ID NO 1333 

<211> Length : 148 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1333 

>HUMPHOSLIP_PEA_2_P35 

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEVKVTELQLTS 
SELDFQPQQELMLQITNASLGLRFRRQLLYWFLKVYDFLSTFITSGMRFLLNQQVWAATGRRVARVGMLSL 



<210> SEQ ID NO 1334 

<211> Length : 258 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1334 
>AI076020_P1 

MLLVLVVLIPVLVSSGGPEGHYEMLGTCRMVCDPYPARGPGAGARTDGGDALSEQSGAPPPSTLVQGPQGKPGRTGK 
PGPPGPPGDPGPPGPVGPPGEKGEPGKPGPPGLPGAGGSGAISTATYTTVPRVAFYAGLKNPHEGYEVLKFDDVVTN 
LGNNYDAASGKFTCNIPGTYFFTYHVLMRGGDGTSMWADLCKNGQVRASAIAQDADQNYDYASNSVILHLDAGDEVF 
IKLDGGKAHGGNSNKYSTFSGFIIYSD 

<210> SEQ ID NO 1335 

<211> Length : 140 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1335 
>T23580_P5 

MTLFPLTFLVQESPAESERLFRGAASPGTERNRPADGQQQGSLPGGHDRVRDAQADHVGRGILPLVERTNVPHHGLY 
EKEIVSHLLTFSSFSKPSVPGFCKCCISAENPRCLLLPPPVHLELCKDSASVFLSSSGPRVSV 

<210> SEQ ID NO 1336 

<211> Length : 919 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1336 
>M79217_PEA_1_P1 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEADEAGKRIFGPRVGNELCEVK 
HVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLS 
LPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTEN 
ADIACLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQSTFYTVQY 
RPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKA 
VQDSKLDQVLVEFTCKNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPV 
VLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQ 
IPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDFYRSWNCAPGPFHLFPH 
TPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEFQAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVV 
VVWNSPKLPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARDRIVG 
FPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKP 
PIKVTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI 

<210> SEQ ID NO 1337 
<211> Length : 907 
<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1337 
>M79217_PEA_1_P2 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEADEAGKRIFGPRVGNELCEVK 
HVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLS 
LPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTEN 
ADIACLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQSTFYTVQY 
RPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKA 
VQDSKLDQVLVEFTCKNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPV 
VLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQ 
IPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDFYRSWNCAPGPFHLFPH 
TPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEFQAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVV 
VVWNSPKLPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARDRIVG 
FPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRC 
PGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI 

<210> SEQ ID NO 1338 

<211> Length : 212 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1338 
>M79217_PEA_1_P4 

PELRQPARLGLPECWDYRHEPRCPAQMGSHFIVQAGLKLLASSKPPKCWDYRVWREARDRIVGFPGRYHAWDIPHQS 
WLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKPPIKVTSRWTFRCPG 
CPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI 

<210> SEQ ID NO 1339 

<211> Length : 812 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1339 
>M79217_PEA_1_P8 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEADEAGKRIFGPRVGNELCEVK 
HVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLS 
LPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTEN 
ADIACLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQSTFYTVQY 
RPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKA 
VQDSKLDQVLVEFTCKNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPV 
VLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQ 
IPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDFYRSWNCAPGPFHLFPH 
TPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEFQAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVV 
VVWNSPKLPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARDRIVG 
FPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKVRKSW 

<210> SEQ ID NO 1340 

<211> Length : 107 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1340 
>M79217_PEA_1_P11 

MGKRRHRPRVSLAVCGPPLATLSWLAGCRSASFSGWPACCSGLGCLTITPSQGSAGHCERWLPGQCSSVWTVPQARA 
HQLGSCPEGWDLSGSCAGQSGELLVCGGKL 

<210> SEQ ID NO 1341 

<211> Length : 725 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1341 
>M62096_PEA_1_P4 

MATYIHVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSVF 
NEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAKD 
QKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRDY 
EKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQELS 
NHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMDSNR 
KMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRLQ 
DAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLL 
LLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVH 
KQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKP 
IRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 



<210> SEQ ID NO 1342 
<211> Length : 674 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1342 
>M62096_PEA_1_P5 

MTRILQDSLGGNCRTTIVICCSPSVFNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVI 
QHLEMELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQ 
QSQLAEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQ 
LTDELAQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYI 
SKMKSEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEEL 
AKLRAQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLE 
QEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDND 
DGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKENAMRDRKRYQQ 
EVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 

<210> SEQ ID NO 1343 

<211> Length : 593 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1343 
>M62096_PEA_1_P3 

MELNRWRNGEAVPEDEQISAKDQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQL 
AEKLKQQMLDQDELLASTRRDYEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDE 
LAQKTTTLTTTQRELSQLQELSNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMK 
SEVKSLVNRSKQLESAQMDSNRKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLR 
AQEKMHEVSFQDKEKEHLTRLQDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKL 
SSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGG 
SAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDR 
IKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 

<210> SEQ ID NO 1344 

<211> Length : 239 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1344 
>M62096_PEA_1_P7 

MTQNFRLMWNILLFPLNFSLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL 
QTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKVHKQLVRDNADLRCELPKLEKRLRAT 
AERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAKPIRPGHYPASSPTAVHAIRGGGGSS 
SNSTHYQK 

<210> SEQ ID NO 1345 

<211> Length : 737 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1345 
>M62096_PEA_1_P8 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQEQVYNACAKQIVKDVLEG 
YNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNL 
AVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLV 
DLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSV 
FNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAK 
DQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRD 
YEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQEL 
SNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMDSN 
RKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRL 
QDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRE 

<210> SEQ ID NO 1346 

<211> Length : 514 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1346 
>M62096_PEA_1_P9 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQEQVYNACAKQIVKDVLEG 
YNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNL 
AVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLV 
DLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSV 
FNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAK 
DQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDEVKNAIYFF 
FHKVLLLLFVVDVCSRNLIGIEAFHNYRIMWKFLGRCPFTASYKLIITEFRK 
<210> SEQ ID NO 1347 

<211> Length : 125 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1347 
>M62096_PEA_1_P10 

MTQNFRLMWNILLFPLNFSLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKLLLLNDKREQAREDLKGLEETVSREL 
QTLHNLRKLFVQDLTTRVKKVSSLCLNGTEKKIKDGREESFSVEISLA 

<210> SEQ ID NO 1348 

<211> Length : 385 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1348 
>M62096_PEA_1_P11 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQEQVYNACAKQIVKDVLEG 
YNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNL 
AVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLV 
DLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSV 
FNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNDFLAAHVFGKLLE 

<210> SEQ ID NO 1349 

<211> Length : 324 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1349 
>M62096_PEA_1_P12 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQEQVYNACAKQIVKDVLEG 
YNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNL 
AVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLV 
DLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSV 
FNEAETKSTLMFGQRV 

<210> SEQ ID NO 1350 
<211> Length : 519 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1350 
>M78076_PEA_1_P3 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDGE 

<210> SEQ ID NO 1351 
<211> Length : 541 
<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1351 
>M78076_PEA_1_P4 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGECLTVNPSLQIPL 
NP 

<210> SEQ ID NO 1352 

<211> Length : 544 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1352 
>M78076_PEA_1_P12 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSRNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGECVCSKGFPFPLI 
GDSEG 

<210> SEQ ID NO 1353 

<211> Length : 619 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1353 
>M78076_PEA_1_P14 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEK 
MNPLEQYERKVNASVPRGFPFHSSEIQRDELVRGGTAGYLGEETRGQRPGCDSQSHTGPSKKPSAPSPLPAGTSWDR 
GVP 

<210> SEQ ID NO 1354 

<211> Length : 597 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1354 
>M78076_PEA_1_P21 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVD 
PEKAQQMRFQVHTHLQVIEERVNQSLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDS 
KDDTPMTLPRGSTEQDAASPEKEKMNPLEQYERKVNASVPRGFPFHSSEIQRDELAPAGTGVSREAVSGLLIMGAGG 
GSLIVLSMLLLRRKKPYGAISHGVVEVDPMLTLEEQQLRELQRHGYENPTYRFLEERP 

<210> SEQ ID NO 1355 

<211> Length : 498 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1355 
>M78076_PEA_1_P24 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIRECLLPWLPLQISEGRS 

<210> SEQ ID NO 1356 

<211> Length : 588 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1356 
>M78076_PEA_1_P2 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVLTSFQLPNAPLFL 
RRPRLRLFSCPLDPLSVSWTPSYPLNTASLPLPSLSAQLPDPETWTLTCCVFDPCFLALGFLLPPPSILCSVPWIFT 
AFPRIVFFFFFFLRQVLALSPRQESSVRSWLIATSTSWVQAILLPQPLE 

<210> SEQ ID NO 1357 

<211> Length : 505 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1357 
>M78076_PEA_1_P25 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE 
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE 
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQPQNPNSQPRAAGSL 
EVIISHPFVRRLEILISPFQFQNSIPKNSQIVPAASPRGTSSP 

<210> SEQ ID NO 1358 

<211> Length : 62 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1358 
>T99080_PEA_4_P1 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQLFTI 

<210> SEQ ID NO 1359 

<211> Length : 64 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1359 
>T99080_PEA_4_P2 

SGRGGLRALVSRWRGGPGVILAAGGEDKEGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQLFTI 

<210> SEQ ID NO 1360 

<211> Length : 129 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1360 
>T99080_PEA_4_P5 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGT 
VQGQLQGPISKVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK 
<210> SEQ ID NO 1361 

<211> Length : 73 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1361 
>T99080_PEA_4_P8 

MQAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHIDKANFNNEKVILKLDYSDFQIVK 

<210> SEQ ID NO 1362 

<211> Length : 87 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1362 
>T99080_PEA_4_P9 

SGRGGLRALVSRWRGGPGVILAAGGEDKEGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQEMTVENRIAETHSKSCV 
PVSFATSYAG 

<210> SEQ ID NO 1363 

<211> Length : 80 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1363 
>T99080_PEA_4_P10 

SGRGGLRALVSRWRGGPGVILAAGGEDKEGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQLCFWEAEEGGSFEPGRL 
RLQ 

<210> SEQ ID NO 1364 

<211> Length : 94 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1364 
>T99080_PEA_4_P12 

SGRGGLRALVSRWRGGPGVILAAGGEDKEGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQEMTVENRIAETHSKSCV 
PVSFATSYAGINEFKRI 

<210> SEQ ID NO 1365 

<211> Length : 69 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1365 
>T99080_PEA_4_P13 

SGRGGLRALVSRWRGGPGVILAAGGEDKEGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQVCGLQALGW 

<210> SEQ ID NO 1366 

<211> Length : 85 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1366 
>T99080_PEA_4_P14 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQEMTVENRIAETHSKSCVPV 
SFATSYAG 

<210> SEQ ID NO 1367 

<211> Length : 78 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1367 
>T99080_PEA_4_P15 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQLCFWEAEEGGSFEPGRLRL 
Q 

<210> SEQ ID NO 1368 

<211> Length : 92 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1368 
>T99080_PEA_4_P16 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQEMTVENRIAETHSKSCVPV 
SFATSYAGINEFKRI 

<210> SEQ ID NO 1369 

<211> Length : 67 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1369 
>T99080_PEA_4_P17 

MPASARLAGAGLLLAFLRALGCAGRAPGLSMAEGNTLISVDYEIFGKVQGVFFRKHTQVCGLQALGW 

<210> SEQ ID NO 1370 

<211> Length : 1305 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1370 
>T08446_PEA_1_P18 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRGPFPRLADCAHFHYENVDF 
GHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSYDDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ 
MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIV 
SVIDMPPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLRADADGPPCGIPAPQGISSLTSAVPRPRGKLAGLLR 
TFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVVDGIYRLSGVSSNIQRLRHEFDS 
ERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLL 
RHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDP 
AGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALG 
RGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSC 
ESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTP 
GDPAPPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPG 
RSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALALAE 
RAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQ 
RPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSF 
QPSSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLS 
YPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPW 
GPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC 
<210> SEQ ID NO 1371 

<211> Length : 246 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1371 
>T08446_PEA_1_P19 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRGPFPRLADCAHFHYENVDF 
GHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSYDDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ 
MLVPLLLQYLETLSGLVDSNLNCGPVLTWMEVGLGRGLGDSEWVRGCVCHHAQHREILDGNRVASAVEDEGAEVDGE 
AFRWGSLWVGESWDM 

<210> SEQ ID NO 1372 

<211> Length : 1081 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1372 
>HUMCA1XIA_P14 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 
KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 
RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA 
AYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEANIVDDFQE 
YNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYE 
YKEYEDKPTSPPNEEFGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPT 
GPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGR 
PGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRG 
ERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGPRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQ 
GNPGPQGLPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVKGADGV 
RGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVGQIGPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPG 
RQGPKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQ 
GPVGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQGLPGA 
AGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQGPPGPVVSMMIINSQTIMVVNYSSSFIT 
LML 

<210> SEQ ID NO 1373 
<211> Length : 729 
<212> Type : PRT 
<213> Organism : Homo sapiens 

<400> sequence : 1373 
>HUMCA1XIA_P15 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 
KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 
RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA 
AYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEANIVDDFQE 
YNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYE 
YKEYEDKPTSPPNEEFGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPT 
GPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGR 
PGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRG 
ERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGPRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQ 
GNPGPQGLPGPQGPIGPPGEKMCCNLSFGILIPLQK 

<210> SEQ ID NO 1374 

<211> Length : 738 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1374 
>HUMCA1XIA_P16 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 
KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 
RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA 
AYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEANIVDDFQE 
YNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYE 
YKEYEDKPTSPPNEEFGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPT 
GPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGR 
PGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRG 
ERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQGPIGPP 
GEKVSFSFSLFYKKVIKFACDKRFVGRHDERKVVKLSLPLYLIYE 

<210> SEQ ID NO 1375 

<211> Length : 273 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1375 
>HUMCA1XIA_P17 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 
KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 
RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA 
AYDYCEHYSPDCDSSAPKAAQAQEPQIDEVRSTRPEKVFVFQ 

<210> SEQ ID NO 1376 

<211> Length : 154 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1376 
>T11628_PEA_1_P2 

MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGIL 
KKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 

<210> SEQ ID NO 1377 

<211> Length : 99 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1377 
>T11628_PEA_1_P5 

MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAM 
NKALELFRKDMASNYKELGFQG 

<210> SEQ ID NO 1378 

<211> Length : 135 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1378 
>T11628_PEA_1_P7 

MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGIL 
KKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKG 

<210> SEQ ID NO 1379 
<211> Length : 154 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1379 
>T11628_PEA_1_P10 

MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGIL 
KKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 

<210> SEQ ID NO 1380 

<211> Length : 315 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1380 
>HUMCEA_PEA_1_P4 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR 
QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNS 
KPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVIL 
NVLCEYICSSLAQAASPNPQGQRQDFSVPLRFKYTDPQPWTSRLSVTFCPRKTWADQVLTKNRRGGAASVLGGSGST 
PYDGRNR 

<210> SEQ ID NO 1381 

<211> Length : 719 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1381 
>HUMCEA_PEA_1_P5 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR 
QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNS 
KPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVIL 
NVLYGPDAPTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQAHNSDTGL 
NRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQNTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTR 
NDVGPYECGIQNELSVDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQEL 
FISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAELPKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQ 
SLPVSPRLQLSNGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISPPDSSYLSGANLNLSCH 
SASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGRNNSIVKSITVSGKWLPGASASYSGVESIW 
FSPKSQEDIFFPSLCSMGTRKSQILS 

<210> SEQ ID NO 1382 

<211> Length : 569 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1382 
>HUMCEA_PEA_1_P14 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR 
QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNS 
KPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVIL 
NVLYGPDAPTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQAHNSDTGL 
NRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQNTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTR 
NDVGPYECGIQNELSVDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQEL 
FISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAELPKPSISSNNSNPVEDKDAVAFTCEPEVQNTTYLWWVNGQ 
SLPVSPRLQLSNGNMTLTLLSCQKERCRIL 

<210> SEQ ID NO 1383 

<211> Length : 346 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1383 
>HUMCEA_PEA_1_P19 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR 
QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNS 
KPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVIL 
NVLYGPDTPIISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGR 
NNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVALI 

<210> SEQ ID NO 1384 

<211> Length : 346 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1384 
>HUMCEA_PEA_1_P20

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR
QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVTKSDLVNEEATGQFRVYPELPKPSISSNNS
KPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLSNGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTL 
DVLYGPDTPIISPPDSSYLSGANLNLSCHSASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGR 
NNSIVKSITVSASGTSPGLSAGATVGIMIGVLVGVALI 



<210> SEQ ID NO 1385 

<211> Length : 385 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1385 

>R35137_PEA_1_PEA_1_PEA_1_P9

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDH 
CRPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVRGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAV 
PLIQEGAHGDGAALRRAAGACLLPLHLQGLHGRVRAYEAGGGSRAMARPSSPDGPPPPPHLTWPCAGAGSAAAMWRW 

<210> SEQ ID NO 1386 

<211> Length : 346 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1386 

>R35137_PEA_1_PEA_1_PEA_1_P8

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDH 
CRPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPYAGQQELAS 
FHSTSKGYMGECVRTRRVGARGPWPGPPRPMGHPLLRT 

<210> SEQ ID NO 1387 
<211> Length : 271 
<212> Type : PRT 
<213> Organism : Homo sapiens

<400> sequence : 1387 

>R35137_PEA_1_PEA_1_PEA_1_P11

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARSG 
FGQREGTYHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS

<210> SEQ ID NO 1388 

<211> Length : 399 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1388 

>R35137_PEA_1_PEA_1_PEA_1_P2

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDH 
CRPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVRGAGEREAGQQSAPVTPCALPGVPGQRVRRGFAV 
PLIQEGAHGDGAALRRAAGACLLPLHLQGLHGRVRVPRRLCGGGEHGRCSAAADAEADECAAVPAGARTGPAGPGGQ 
PARAHRPLLCAVPG 

<210> SEQ ID NO 1389 

<211> Length : 555 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1389 

>R35137_PEA_1_PEA_1_PEA_1_P4

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDH 
CRPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPYAGQQELAS 
FHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCPPVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAE 
LAAKAKLTEQVFNEAPGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGICVVPGSGFGQREG 
TYHFRMTILPPLEKLRLLLEKLSRFHAKFTLESPGRLWSPLYLLLMPGGVGWGGCWAPASLQVPNKAVWQSDSKKEA 
LAAAWPAPTCLPFLQA 
<210> SEQ ID NO 1390

<211> Length : 139 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1390 
>Z25299_PEA_2_P2

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCPGKKRCCPDTCGIKCLDPVD 
TPNPTRRKPGKCPVTYGQCLMLNPPNFCEMDGQCKRDLKCCMGMCGKSCVSPVKGKQGMRAH

<210> SEQ ID NO 1391 

<211> Length : 156 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1391 
>Z25299_PEA_2_P3

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCPGKKRCCPDTCGIKCLDPVD 
TPNPTRRKPGKCPVTYGQCLMLNPPNFCEMDGQCKRDLKCCMGMCGKSCVSPVKGEKRHHKQLRDQEVDPLEMRRHS 

AG 

<210> SEQ ID NO 1392 

<211> Length : 89 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1392 

>Z25299_PEA_2_P7

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCPGKKRCCPDTCGIKCLDPVD 

TPNPRGSLGSAQ 

<210> SEQ ID NO 1393 

<211> Length : 82 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1393 
>Z25299_PEA_2_P10

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCPGKKRCCPDTCGIKCLDPVD
TPNPT 

<210> SEQ ID NO 1394 

<211> Length : 496 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1394 
>HSSTROL3_P4 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP 
PRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRAD 
IMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKA 
LMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGEL 
FFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRF 
PVHAALVWGPEKNKIYFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADGALGVRQLVGGGHSSRFS 
HLVVAGLPHACHRKSGSSSQVLCPEPSALLSVAG 

<210> SEQ ID NO 1395 

<211> Length : 382 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1395 
>HSSTROL3_P5 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP 
PRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRAD 
IMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKA 
LMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGEL 
FFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQELGFPSSTGRDESLEHCRCQGLHK 

<210> SEQ ID NO 1396 

<211> Length : 370 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1396 
>HSSTROL3_P7

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP
PRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRAD 
IMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKA 
LMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGEL
FFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQGTTGVSTPAPGV 

<210> SEQ ID NO 1397 

<211> Length : 301 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1397 
>HSSTROL3_P8 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP 
PRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRAD 
IMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKA 
LMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEVRPCLPVPLLLCWPL 

<210> SEQ ID NO 1398 

<211> Length : 354 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1398 
>HSSTROL3_P9 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP 
PRCGVPDPSDGLSARNRQKRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYWHGDDLPF 
DGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSP 
DDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQL 
QPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQGTTGVSTPAPGV 

<210> SEQ ID NO 1399 

<211> Length : 137 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1399 
>HUMTREFAC_PEA_2_P7

MAARALCMLGLVLALLSSSSAEEYVGLSQQGLWQLTGLCLGQLQTSVPCQPRTGWTAATPMSPPRSATTGAAALTPG
SLECLGVSSPCRKQNAPSEAPPAAPGRGMRGSEHPCPAVXAARHCSSQLFCPFAPGKRFC 

<210> SEQ ID NO 1400 

<211> Length : 41 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1400 
>HUMTREFAC_PEA_2_P8

MAARALCMLGLVLALLSSSSAEEYVGLWKVHLPKGEGFSSG 

<210> SEQ ID NO 1401 

<211> Length : 159 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1401 
>HSS100PCB_P3 

MHTVVSWSDDMFLPILIVFLSTWTHSSLAQGWVISTPCCCCSDLHPGPAWSHSENALLSHGQALRALPAWPSQRRPE 
SLGPKQRGRTGLRVLAPCSLHTWGSTLAFPGSLCDFRAGPLGPLGLIIFICNRAMPLPCLVVLRSSCLCKQLVQCQH 

EMGGP 

<210> SEQ ID NO 1402 

<211> Length : 187 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1402 
>R20779_P2

MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNTAEIQHCLVNAGDVGCGVFECFENNSCE 
IRGLHGICMTFLHNAGKFDAQGKSFIKDALKCKAHALRHRFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQENTR 

VIVEMIHFKDLLLHECYKIEITMPKRRKVKLRD

<210> SEQ ID NO 1403 

<211> Length : 449 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1403
>R38144_PEA_2_P6

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA 
EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD 
IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVS 
MPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATG 
DPTLLELGRDAVESIEKISKVECGFATLASFSHMSDQRSARPQAGQPHGVVLPGRDCEIPLPPV 

<210> SEQ ID NO 1404 

<211> Length : 341 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1404 
>R38144_PEA_2_P13

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA 
EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD 
IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVS 
MPVFQSLEAYWPGLQNLLKAQCTSTVPRGIPPS 

<210> SEQ ID NO 1405 

<211> Length : 287 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1405 
>R38144_PEA_2_P15

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 

LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA 

EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD 

IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEPHWRH 

<210> SEQ ID NO 1406 

<211> Length : 433 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1406
>R38144_PEA_2_P19

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA 
EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD 
IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVS 
MPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATG 
DPTLLELGRDAVESIEKISKVECGFATKRSRSVAQAGVQWCDHDSPQP 

<210> SEQ ID NO 1407 

<211> Length : 418 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1407 
>R38144_PEA_2_P24

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIREYNKAIRNYTRFDDWYLWVQMYKGTVSMPVFQS 
LEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLE
LGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGA 
GGYIFNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPENHDQARE 
RKPAKQKVPLLSCPSQPFTSKLALLGQVFLDSS 

<210> SEQ ID NO 1408 

<211> Length : 60 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1408 
>R38144_PEA_2_P36

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRFWGMSQNSKEWLKCSRTAWTLILM 

<210> SEQ ID NO 1409 

<211> Length : 112 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1409
>R11723_PEA_1_P2

MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPN 

LVGHPAYGQCHNNQPWADTSRRERQRKEKHSMRTQ 

<210> SEQ ID NO 1410 

<211> Length : 222 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1410 
>R11723_PEA_1_P6

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSAGIMYRKSCASSAAC 
LIASAGSPCRGLAPGREEQRALHKAGAVGGGVRMYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEV 
EKRLREGEEDHVRPEVGPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRERQRKEKHSMRTQ 

<210> SEQ ID NO 1411 

<211> Length : 93 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1411 
>R11723_PEA_1_P7

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSAGSHCVTRLECSGTI 
SAHCNLCLPGSNDHPT 

<210> SEQ ID NO 1412 

<211> Length : 84 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1412 
>R11723_PEA_1_P13

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSADTKRTNTLLFEMRH 
FAKQLTT 

<210> SEQ ID NO 1413 
<211> Length : 90 
<212> Type : PRT 
<213> Organism : Homo sapiens

<400> sequence : 1413 
>R11723_PEA_1_P10

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVMEQSADRVSLCHEAGVQWN 
NFSTLQPLPPRLK 

<210> SEQ ID NO 1414 

<211> Length : 111 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1414 
>R16276_PEA_1_P7

MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGQCPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSD 
LEPCDESSGLYCDRSADPSNQTGICTGNPAPSAV 

<210> SEQ ID NO 1415 

<211> Length : 111 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1415 
>HSU33147_PEA_1_P5

MKLLMVLMLAALSQHCYAGSGCPLLENVISKTINPQVSKTEYKELLQEFIDDNATTNAIDELKECFLNQTDETLSNV 
EQLIYDSSLCDLF 

<210> SEQ ID NO 1416 

<211> Length : 93 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1416 
>Mammaglobin A precursor 

MKLLMVLMLAALSQHCYAGSGCPLLENVISKTINPQVSKTEYKELLQEFIDDNATTNAIDELKECFLNQTDETLSNV 
EVFMQLIYDSSLCDLF 
<210> SEQ ID NO 1417

<211> Length : 1055 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1417 

>Ephrin type-B receptor 2 [precursor] 

MALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWEEVSGYDENMNTIRTYQVCNVFESSQNNWLRTKF 

IRRRGAHRIHVEMKFSVRDCSSIPSVPGSCKETFNLYYYEADFDSATKTFPNWMENPWVKVDTIAADESFSQVDLGG 

RVMKINTEVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTSLVAARGSCIANAE 

EVDVPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRGCPSGTFKANQGDEACTHCPINSRTTSEGATNCVCRNG 

YYRADLDPLDMPCTTIPSAPQAVISSVNETSLMLEWTPPRDSGGREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQ 

LGLTEPRIYISDLLAHTQYTFEIQAVNGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVDSITLSWSQPDQP

NGVILDYELQYYEKELSEYNATAIKSPTNTVTVQGLKAGAIYVFQVRARTVAGYGRYSGKMYFQTMTEAEYQTSIQE 

KLPLIIGSSAAGLVFLIAVVVIAIVCNRRGFERADSEYTDKLQHYTSGHMTPGMKIYIDPFTYEDPNEAVREFAKEI 

DISCVKIEQVIGAGEFGEVCSGHLKLPGKREIFVAIKTLKSGYTEKQRRDFLSEASIMGQFDHPNVIHLEGVVTKST 

PVMIITEFMENGSLDSFLRQNDGQFTVIQLVGMLRGIAAGMKYLADMNYVHRDLAARNILVNSNLVCKVSDFGLSRF 

LEDDTSDPTYTSALGGKIPIRWTAPEAIQYRKFTSASDVWSYGIVMWEVMSYGERPYWDMTNQDVINAIEQDYRLPP 

PMDCPSALHQLMLDCWQKDRNHRPKFGQIVNTLDKMIRNPNSLKAMAPLSSGINLPLLDRTIPDYTSFNTVDEWLEA 

IKMGQYKESFANAGFTSFDVVSQMMMEDILRLGVTLAGHQKKILNSXQVMRAQMNQIQSVEGQPLARRPRATGRTKR 

CQPRDVTKKTCNSNDGKKKGMGKKKTDPGRGREIQGIFFKEDSHKESNDCSCGG 



<210> SEQ ID NO 1418 

<211> Length : 478 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1418 

>Vitronectin precursor 

MAPLRPLLILALLAWVALADQESCKGRCTEGFNVDKKCQCDELCSYYQSCCTDYTAECKPQVTRGDVFTMPEDEYTV 
YDDGEEKNNATVHEQVGGPSLTSDLQAQSKGNPEQTPVLKPEEEAPAPEVGASKPEGIDSRPETLHPGRPQPPAEEE 
LCSGKPFDAFTDLKNGSLFAFRGQYCYELDEKAVRPGYPKLIRDVWGIEGPIDAAFTRINCQGKTYLFKGSQYWRFE 
DGVLDPDYPRNISDGFDGIPDNVDAALALPAHSYSGRERVYFFKGKQYWEYQFQHQPSQEECEGSSLSAVFEHFAMM
QRDSWEDIFELLFWGRTSAGTRQPQFISRDWHGVPGQVDAAMAGRIYISGMAPRPSLAKKQRFRHRNRKGYRSQRGH
SRGRNQNSRRPSRATWLSLFSSEESNLGANNYDDYRMDWLVPATCEPIQSVFFFSGDKYYRVNLRTRRVDTVDPPYP 

RSIAQYWLGCPAPGHL 



<210> SEQ ID NO 1419 

<211> Length : 871 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1419 

>Extracellular sulfatase Sulf-1 precursor 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSA 
PQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELEN 
TYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSV 
LKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQ 
CIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVH 
TRQTRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTH 
KCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQE 
KLKSHLHPFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTS 
SNNNTYWCLRTVNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKN 

LDVGNKDGGSYDLHRGQLWDGWEG 



SEQ ID NO: 1420 
RPL19 Reverse primer 
TGATCAGCCCATCTTTGATGAG 



<210> SEQ ID NO 1421 
<211> Length : 148

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1421 

>Gastrin-releasing peptide precursor 

MRGSELPLVLLALVLCLAPRGRAVPLPAGGGTVLTKMYPRGNHWAVGHLMGKKSTGESSSVSERGSLKQQLREYIRW 
EEAARNLLGLIEAKENRNHQPPQPKALGNQQPSWDSEDSSNFKDVGSKGKVGRLSAPGSQREGRNPQLNQQ 



<210> SEQ ID NO 1422 

<211> Length : 170 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1422 

>Neurotensin/neuromedin N precursor [Contains: Large neuromedin N (NmN-125);
Neuromedin N (NmN) (NN); Neurotensin (NT); Tail peptide]

MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGE 
VHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYE 

NKPRRPYILKRDSYYY 

<210> SEQ ID NO 1423 

<211> Length : 185 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1423 

>ADM precursor [Contains: Adrenomedullin (AM); Proadrenomedullin N-20 
terminal peptide (ProAM-N20) (ProAM N-terminal 20 peptide) (PAMP)]
MKLVSVALMYLGSLAFLGADTARLDVASEFRKKWNKWALSRGKRELRMSSSYPTGLADVKAGPAQTLIRPQDMKGAS 
RSPEDSSPDAARIRVKRYRQSMNNFQGLRSFGCRFGTCTVQKLAHQIYQFTDKDKDNVAPRSKISPQGYGRRRRRSL 
PEAGPGRTLVSSKPQAHGAPAPPSGSAPHFL 
<210> SEQ ID NO 1424

<211> Length : 328 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1424 

>Mitotic checkpoint protein BUB3 

MTGSNEFKLNQPPEDGISSVKFSPNTSQFLLVSSWDTSVRLYDVPANSMRLKYQHTGAVLDCAFYDPTHAWSGGLDH 
QLKMHDLNTDQENLVGTHDAPIRCVEYCPEVNVMVTGSWDQTVKLWDPRTPCNAGTFSQPEKVYTLSVSGDRLIVGT 
AGRRVLVWDLRNMGYVQQRRESSLKYQTRCIRAFPNKQGYVLSSIEGRVAVEYLDPSPEVQKKKYAFKCHRLKENNI 
EQIYPVNAISFHNIHNTFATGGSDGFVNIWDPFNKKRLCQFHRYPTSIASLAFSNDGTTLAIASSYMYEMDDTEHPE 
DGIFIRQVTDAETKPKSPCT



<210> SEQ ID NO 1425 

<211> Length : 114 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1425 

>Small inducible cytokine B5 precursor 

MSLLSSRAARVPGPSSSLCALLVLLLLLTQPGPIASAGPAAAVLRELRCVCLQTTQGVHPKMISNLQVFAIGPQCSK 
VEVVASLKNGKEICLDPEAPFLKKVIQKILDGGNKEN 



<210> SEQ ID NO 1426 

<211> Length : 461 

<212> Type : PRT 

<213> Organism : Homo sapiens 



<400> sequence : 1426 
>Ornithine decarboxylase

MNNFGNEEFDCHFLDEGFTAKDILDQKINEVSSSDDKDAFYVADLGDILKKHLRWLKALPRVTPFYAVKCNDSKAIV 
KTLAATGTGFDCASKTEIQLVQSLGVPPERIIYANPCKQVSQIKYAANNGVQMMTFDSEVELMKVARAHPKAKLVLR 
IATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFVQAISDARCVFDMGAEVGFSMYL
LDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDE
SSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFEN 
MGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRHRAACASASINV 



<210> SEQ ID NO 1427 

<211> Length : 214 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1427 

>Tescalcin 

MGAAHSASEEVRELEGKTGFSSDQIEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFFDNRNLRKGPS 
GLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSDGRITLEEYRNVVEELLSGNPHIEKESA
RSIADGAMMEAASVCMGQMEPDQVYEGITFEDFLKIWQGIDIETKMHVRFLNMETMALCH 



<210> SEQ ID NO 1428 

<211> Length : 250 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1428 

>Kallikrein 11 precursor 

MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTAAHCLKPRYIVHLGQHNL 
QKEEGCEQTRTATESFPHPGFNNSLPNKDHRNDIMLVKMASPVSITWAVRPLTLSSRCVTAGTSCLISGWGSTSSPQ 
LRLPHTLRCANITIIEHQKCENAYPGNITDTMVCASVQEGGKDSCQGDSGGPLVCNQSLQGIISWGQDPCAITRKPG 

VYTKVCKYVDWIQETMKNN 
<210> SEQ ID NO 1429

<211> Length : 99 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1429 

>Small inducible cytokine B14 precursor 

MRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKYPHCEEKMVIITTKSVSRYRGQEHCLHPKL 
QSTKRFIKWYNAWNEKRRVYEE 



<210> SEQ ID NO 1430 
<211> Length : 446 
<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1430 

>SPARC related modular calcium-binding protein 2 precursor 

MLLPQLCWLPLLAGLLPPVPAQKFSALTFLRVDQDKDKDCSLDCAGSPQKPLCASDGRTFLSRCEFQRAKCKDPQLE 
IAYRGNCKDVSRCVAERKYTQEQARKEFQQVFIPECNDDGTYSQVQCHSYTGYCWCVTPNGRPISGTAVAHKTPRCP 
GSVNEKLPQREGTGKTDDAAAPALETQPQGDEEDIASRYPTLWTEQVKSRQNKTNKNSVSSCDQEHQSALEEAKQPK 
NDNVVIPECAHGGLYKPVQCHPSTGYCWCVLVDTGRPIPGTSTRYEQPKCDNTARAHPAKARDLYKGRQLQGCPGAK 
KHEFLTSVLDALSTDMVHAASDPSSSSGRLSEPDPSHTLEERVVHWYFKLLDKNSSGDIGKKEIKPFKRFLRKKSKP 
KKCVKKFVEYCDVNNDKSISVQELMGCLGVAKEDGKADTKKRHTPRGHAESTSNRQPRKQG 

<210> SEQ ID NO 1431 

<211> Length : 314 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1431 

>Testisin precursor 
MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGEDAELGRWPWQGSLRLWDSHVCGVSLLSHRWA
LTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYLSPRYLGNSPYDIALVKLSAPVTYTKHIQ 
PICLQASTFEFENRTDCWVTGWGYIKEDEALPSPHTLQEVQVAIINNSMCNHLFLKYSFRKDIFGDMVCAGNAQGGK 
DACFGDSGGPLACNKNGLWYQIGVVSWGVGCGRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWAL 
PLLGPV 



<210> SEQ ID NO 1432 

<211> Length : 517 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1432 

>Poliovirus receptor related protein 1 precursor 

MARMGLAGAAGRWWGLALGLTAFFLPGVHSQVVQVNDSMYGFIGTDVVLHCSFANPLPSVKITQVTWQKSTNGSKQN
VAIYNPSMGVSVLAPYRERVEFLRPSFTDGTIRLSRLELEDEGVYICEFATFPTGNRESQLNLTVMAKPTNWIEGTQ 
AVLRAKKGQDDKVLVATCTSANGKPPSVVSWETRLKGEAEYQEIRNPNGTVTVISRYRLVPSREAHQQSLACIVNYH 
MDRFKESLTLNVQYEPEVTIEGFDGNWYLQRMDVKLTCKADANPPATEYHWTTLNGSLPKGVEAQNRTLFFKGPINY 
SLAGTYICEATNPIGTRSGQVEVNITEFPYTPSPPEHGRRAGPVPTAIIGGVAGSILLVLIVVGGIVVALRRRRHTF 
KGDYSTKKHVYGNGYSKAGIPQHHPPMAQNLQYPDDSDDEKKAGPLGGSSYEEEEEEEEGGGGGERKVGGPHPKYDE 
DAKRPYFTVDEAEARQDGYGDRTLGYQYDPEQLDLAENMVSQNDGSFISKKEWYV



<210> SEQ ID NO 1433 

<211> Length : 493 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1433 

>Phospholipid transfer protein precursor

MALFGALFLALLAGAHAEFPGCKIRVTSKALELVKQEGLRFLEQELETITIPDLRGKEGHFYYNISEVKVTELQLTS 
SELDFQPQQELMLQITNASLGLRFRRQLLYWFFYDGGYINASAEGVSIRTGLELSRDPAGRMKVSNVSCQASVSRMH 
AAFGGTFKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVPVRSSVDELVGIDYSLMKDPVASTSNLD 
MDFRGAFFPLTERNWSLPNRAVEPQLQEEERMVYVAFSEFFFDSAMESYFRAGALQLLLVGDKVPHDLDMLLRATYF 
GSIVLLSPAVIDSPLKLELRVLAPPRCTIKPSGTTISVTASVTIALVPPDQPEVQLSSMTMDARLSAKMALRGKALR
TQLDLRRFRIYSNHSALESLALIPLQAPLKTMLQIGVMPMLNERTWRGVQIPLPEGINFVHEVVTNHAGFLTIGADL
HFAKGLREVIEKNRPADVRASTAPTPSTAAV 



<210> SEQ ID NO 1434 

<211> Length : 258 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1434 

>C1q-related factor precursor

MLLVLVVLIPVLv^GGPEGHYEMLGTCRMVCDPYPARGPGAGARTDGGDALSEQSGAPPPSTLVQGPQGKPGRTGK 
PGPPGPPGDPGPPGPVGPPGEKGEPGKPGPPGLPGAGGSGAISTATYTTVPRVAFYAGLKNPHEGYEVLKFDDVVTN 
LGNNYDAASGKFTCNIPGTYFFTYHVLMRGGDGTSMWADLCKNGQVRASAIAQDADQNYDYASNSVILHLDAGDEVF 
IKLDGGKAHGGNSNKYSTFSGFIIYSD 



<210> SEQ ID NO 1435 

<211> Length : 199 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1435 

>Neuronal protein NP25 

MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIILQCAEDIEHPPPGRAHFQKWLMDGTVLCKLINSLYPPGQEPI 
PKISESKMAFKQMEQISQFLKAAETYGVRTTDIFQTVDLWEGKDMAAVQRTLMALGSVAVTKDDGCYRGEPSWFHRK 
AQQNRRGFSEEQLRQGQNVIGLQMGSNKGASQAGMTGYGMPRQIM 

<210> SEQ ID NO 1436 

<211> Length : 919 

<212> Type : PRT 

<213> Organism : Homo sapiens 




<400> sequence : 1436 
>Exostosin-like 3 

MTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEADEAGKRIFGPRVGNELCEVK 
HVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSYKELMAQNQPKLS 
LPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQATARANVYVTEN 
ADIACLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRAMVAQSTFYTVQY 
RPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESLRSSLQEARSFEEEMEGDPPADYDDRIIATLKA 
VQDSKLDQVLVEFTCKNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATRLFEALEVGAVPV 
VLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIFNTVLAMIRTRIQ 
IPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDFYRSWNCAPGPFHLFPH 
TPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEFQAALGGNVPREQFTVVMLTYEREEVLMNSLERLNGLPYLNKVV 
VVWNSPKLPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGFRVWREARDRIVG 
FPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIAMNFLVSHITRKP 
PIKVTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDKTKCFKFI 



<210> SEQ ID NO 1437 

<211> Length : 931 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1437 

>BAA25445

PDVIWGAGCRGLMTGYTMLRNGGAGNGGQTCMLRWSNRIRLTWLSFTLFVILVFFPLIAHYYLTTLDEADEAGKRIF 
GPRVGNELCEVKHVLDLCRIRESVSEELLQLEAKRQELNSEIAKLNLKIEACKKSIENAKQDLLQLKNVISQTEHSY 
KELMAQNQPKLSLPIRLLPEKDDAGLPPPKATRGCRLHNCFDYSRCPLTSGFPVYVYDSDQFVFGSYLDPLVKQAFQ 
ATARANVYVTENADIACLYVILVGEMQEPVVLRPAELEKQLYSLPHWRTDGHNHVIINLSRKSDTQNLLYNVSTGRA 
MVAQSTFYTVQYRPGFDLVVSPLVHAMSEPNFMEIPPQVPVKRKYLFTFQGEKIESLRSSLQEARSFEEEMEGDPPA
DYDDRIIATLKAVQDSKLDQVLVEFTCKNQPKPSLPTEWALCGEREDRLELLKLSTFALIITPGDPRLVISSGCATR 
LFEALEVGAVPVVLGEQVQLPYQDMLQWNEAALVVPKPRVTEVHFLLRSLSDSDLLAMRRQGRFLWETYFSTADSIF
NTVLAMIRTRIQIPAAPIREEAAAEIPHRSGKAAGTDPNMADNGDLDLGPVETEPPYASPRYLRNFTLTVTDFYRSW 
NCAPGPFHLFPHTPFDPVLPSEAKFLGSGTGFRPIGGGAGGSGKEFQAALGGNVPREQFTVVMLTYEREEVLMNSLE 
RLNGLPYLNKVVVVWNSPKLPSEDLLWPDIGVPIMVVRTEKNSLNNRFLPWNEIETEAILSIDDDAHLRHDEIMFGF
RVWREARDRIVGFPGRYHAWDIPHQSWLYNSNYSCELSMVLTGAAFFHKYYAYLYSYVMPQAIRDMVDEYINCEDIA 
MNFLVSHITRKPPIKVTSRWTFRCPGCPQALSHDDSHFHERHKCINFFVKVYGYMPLLYTQFRVDSVLFKTRLPHDK 
TKCFKFI



<210> SEQ ID NO 1438 

<211> Length : 957 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1438 

>Kinesin heavy chain isoform 5C 

MADPAECSIKVMCRFRPLNEAEILRGDKFIPKFKGDETVVIGQGKPYVFDRVLPPNTTQEQVYNACAKQIVKDVLEG 
YNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIAHDIFDHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNL 
AVHEDKNRVPYVKGCTERFVSSPEEVMDVIDEGKANRHVAVTNMNEHSSRSHSIFLINIKQENVETEKKLSGKLYLV
DLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKTHVPYRDSKMTRILQDSLGGNCRTTIVICCSPSV 
FNEAETKSTLMFGQRAKTIKNTVSVNLELTAEEWKKKYEKEKEKNKTLKNVIQHLEMELNRWRNGEAVPEDEQISAK 
DQKNLEPCDNTPIIDNIAPVVAGISTEEKEKYDEEISSLYRQLDDKDDEINQQSQLAEKLKQQMLDQDELLASTRRD 
YEKIQEELTRLQIENEAAKDEVKEVLQALEELAVNYDQKSQEVEDKTRANEQLTDELAQKTTTLTTTQRELSQLQEL 
SNHQKKRATEILNLLLKDLGEIGGIIGTNDVKTLADVNGVIEEEFTMARLYISKMKSEVKSLVNRSKQLESAQMDSN 
RKMNASERELAACQLLISQHEAKIKSLTDYMQNMEQKRRQLEESQDSLSEELAKLRAQEKMHEVSFQDKEKEHLTRL 
QDAEEMKKALEQQMESHREAHQKQLSRLRDEIEEKQKIIDEIRDLNQKLQLEQEKLSSDYNKLKIEDQEREMKLEKL 
LLLNDKREQAREDLKGLEETVSRELQTLHNLRKLFVQDLTTRVKKSVELDNDDGGGSAAQKQKISFLENNLEQLTKV 
HKQLVRDNADLRCELPKLEKRLRATAERVKALESALKEAKENAMRDRKRYQQEVDRIKEAVRAKNMARRAHSAQIAK 
PIRPGHYPASSPTAVHAIRGGGGSSSNSTHYQK 



<210> SEQ ID NO 1439 

<211> Length : 650 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1439 

>Amyloid-like protein 1 precursor 

MGPASPAARGLSRRPGQPPLPLLLPLLLLLLRAQPAIGSLAGGSPGAAEAPGSAQVAGLCGRLTLHRDLRTGRWEPD 
PQRSRRCLRDPQRVLEYCRQMYPELQIARVEQATQAIPMERWCGGSRSGSCAHPHHQVVPFRCLPGEFVSEALLVPE
GCRFLHQERMDQCESSTRRHQEAQEACSSQGLILHGSGMLLPCGSDRFRGVEYVCCPPPGTPDPSGTAVGDPSTRSW 
PPGSRVEGAEDEEEEESFPQPVDDYFVEPPQAEEEEETVPPPSSHTLAVVGKVTPTPRPTDGVDIYFGMPGEISEHE
GFLRAKMDLEERRMRQINEVMREWAMADNQSKNLPKADRQALNEHFQSILQTLEEQVSGERQRLVETHATRVIALIN 
DQRRAALEGFLAALQADPPQAERVLLALRRYLRAEQKEQRHTLRHYQHVAAVDPEKAQQMRFQVHTHLQVIEERVNQ 
SLGLLDQNPHLAQELRPQIQELLHSEHLGPSELEAPAPGGSSEDKGGLQPPDSKDDTPMTLPKGSTEQDAASPEKEK 
MNPLEQYERKVNASVPRGFPFHSSEIQRDELAPAGTGVSREAVSGLLIMGAGGGSLIVLSMLLLRRKKPYGAISHGV 
VEVDPMLTLEEQQLRELQRHGYENPTYRFLEERP 



<210> SEQ ID NO 1440 

<211> Length : 98 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1440 

>Acylphosphatase, organ-common type isozyme 

AEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHIDK 
ANFNNEKVILKLDYSDFQIVK 



<210> SEQ ID NO 1441 

<211> Length : 99 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1441 

>ACYO_HUMAN_V1

MAEGNTLISVDYEIFGKVQGVFFRKHTQAEGKKLGLVGWVQNTDRGTVQGQLQGPISKVRHMQEWLETRGSPKSHID 
KANFNNEKVILKLDYSDFQIVK 
<210> SEQ ID NO 1442

<211> Length : 246 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1442 

>Sorting nexin 26 

MLSLSLCSHLWGPLILSALQARSTDSLDGPGEGSVQPLPTAGGPSVKGKPGKRLSAPRGPFPRLADCAHFHYENVDF 
GHIQLLLSPDREGPSLSGENELVFGVQVTCQGRSWPVLRSYDDFRSLDAHLHRCIFDRRFSCLPELPPPPEGARAAQ 
MLVPLLLQYLETLSGLVDSNLNCGPVLTWMEVGLGRGLGDSEWVRGCVCHHAQHREILDGNRVASAVEDEGAEVDGE 

AFRWGSLWVGESWDM 



<210> SEQ ID NO 1443 

<211> Length : 862 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1443 

>Q9NT23 

HDVIQQLPPPHYRTLEYLLRHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFL
LTHVDVLFSDTFTSAGLDPAGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGER 
GEKQRKPGGSSWKTFFALGRGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLR 
RPHSSSDAFPVGPAPAGSCESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLE 
GLRGLDFDPLTFRCSSPTPGDPAPPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLEL 
LGAGGAPASATPTPALSPGRSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGG 
APPPPPKNPARLMALALAERAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVP 
GPPPYLPRQQSDGSLLRSQRPMGTSRRGLRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECL 
PPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYEIGASEGSPYSGLTRSWSPFRSMPPDRLNAS 
YGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRS 
RSDPGPPVPRLPQKQRAPWGPRTPHRVPGPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPT 

PSWSLHSEGQTRSYC 



<210> SEQ ID NO 1444 
<211> Length : 295

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1444 

>Q96CP3 

LRGPAQVSAQLRAGGGGRDAPEAAAQSPCSVPSQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVW 
RSSLGPPAPLDRGENLYYEIGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFP 
PDHLGYSAPQHPARRPTPPEPLYVNLALGPRGPSPASSSSSSPPAHPRSRSDPGPPVPRLPQKQRAPWGPRTPHRVP 
GPWGPPEPLLLYRAAPPAYGRGGELHRGSLYRNGGQRGEGAGPPPPYPTPSWSLHSEGQTRSYC 



<210> SEQ ID NO 1445 
<211> Length : 1007 
<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1445 
>BAC86902 

MLVPLLLQYLETLSGLVDSNLNCGPVLTWMELDNHGRRLLLSEEASLNIPAVAAAHVIKRYTAQAPDELSFEVGDIV 
SVIDMPPTEDRSWWRGKRGFQVGFFPSECVELFTERPGPGLKADADGPPCGIPAPQGISSLTSAVPRPRGKLAGLLR 
TFMRSRPSRQRLRQRGILRQRVFGCDLGEHLSNSGQDVPQVLRCCSEFIEAHGVVDGIYRLSGVSSNIQRLRHEFDS 
ERIPELSGPAFLQDIHSVSSLCKLYFRELPNPLLTYQLYGKFSEAMSVPGEEERLVRVHDVIQQLPPPHYRTLEYLL 
RHLARMARHSANTSMHARNLAIVWAPNLLRSMELESVGMGGAAAFREVRVQSVVVEFLLTHVDVLFSDTFTSAGLDP 
AGRCLLPRPKSLAGSCPSTRLLTLEEAQARTQGRLGTPTEPTTPKAPASPAERRKGERGEKQRKPGGSSWKTFFALG 
RGPSVPRKKPLPWLGGTRAPPQPSGSRPDTVTLRSAKSEESLSSQASGAGLQRLHRLRRPHSSSDAFPVGPAPAGSC 
ESLSSSSSSESSSSESSSSSSESSAAGLGALSGSPSHRTSAWLDDGDELDFSPPRCLEGLRGLDFDPLTFRCSSPTP 
GDPAPPASPAPPAPASAFPPRVTPQAISPRGPTSPASPAALDISEPLAVSVPPAVLELLGAGGAPASATPTPALSPG
RSLRPHLIPLLLRGAEAPLTDACQQEMCSKLRGAQGPLGPDMESPLPPPPLSLLRPGGAPPPPPKNPARLMALALAE 
RAQQVAEQQSQQECGGTPPASQSPFHRSLSLEVGGEPLGTSGSGPPPNSLAHPGAWVPGPPPYLPRQQSDGSLLRSQ 
RPMGTSRRGLRGPAQVPTPGFFSPAPRECLPPFLGVPKPGLYPLGPPSFQPSSPAPVWRSSLGPPAPLDRGENLYYE 
IGASEGSPYSGPTRSWSPFRSMPPDRLNASYGMLGQSPPLHRSPDFLLSYPPAPSCFPPDHLGYSPPSTLLGALHRL 

SPSTST 
<210> SEQ ID NO 1446

<211> Length : 1806 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1446 

>Collagen alpha 1 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 

KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 

RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA 

AYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEANIVDDFQE 

YNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYE 

YKEYEDKPTSPPNEEFGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPT

GPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGR 

PGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRG 

ERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGPRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQ 

GNPGPQGLPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVKGADGV 

RGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVGQIGPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPG 

RQGPKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQ 

GPVGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGYPGPPGPPGEQGLPGA 

AGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQGPPGPVGSPGERGSAGTAGPIGLRGRPG 

PQGPPGPAGEKGAPGEKGPQGPAGRDGVQGPVGLPGPAGPAGSPGEDGDKGEIGEPGQKGSKGGKGENGPPGPPGLQ 

GPVGAPGIAGGDGEPGPRGQQGMFGQKGDEGARGFPGPPGPIGLQGLPGPPGEKGENGDVGPMGPPGPPGPRGPQGP 

NGADGPQGPPGSVGSVGGVGEKGEPGEAGNPGPPGEAGVGGPKGERGEKGEAGPPGAAGPPGAKGPPGDDGPKGNPG 

PVGFPGDPGPPGELGPAGQDGVGGDKGEDGDPGQPGPPGPSGEAGPPGPPGKRGPPGAAGAEGRQGEKGAKGEAGAE 

GPPGKTGPVGPQGPAGKPGPEGLRGIPGPVGEQGLPGAAGQDGPPGPMGPPGLPGLKGDPGSKGEKGHPGLIGLIGP 

PGEQGEKGDRGLPGTQGSPGAKGDGGIPGPAGPLGPPGPPGLPGPQGPKGNKGSTGPAGQKGDSGLPGPPGPPGPPG 

EVIQPLPILSSKKTRRHTEGMQADADDNILDYSDGMEEIFGSLNSLKQDIEHMKFPMGTQTNPARTCKDLQLSHPDF 

PDGEYWIDPNQGCSGDSFKVYCNFTSGGETCIYPDKKSEGVRISSWPKEKPGSWFSEFKRGKLLSYLDVEGNSINMV 

QMTFLKLLTASARQNFTYHCHQSAAWYDVSSGSYDKALRFLGSNDEEMSYDNNPFIKTLYDGCTSRKGYEKTVIEIN 

TPKIDQVPIVDVMISDFGDQNQKFGFEVGPVCFLG 



<210> SEQ ID NO 1447 
<211> Length : 1806 
<212> Type : PRT

<213> Organism : Homo sapiens 

<400> sequence : 1447 

>CA1B_HUMAN_V5 

MEPWSSRWKTKRWLWDFTVTTLALTFLFQAREVRGAAPVDVLKALDFHNSPEGISKTTGFCTNRKNSKGSDTAYRVS 

KQAQLSAPTKQLFPGGTFPEDFSILFTVKPKKGIQSFLLSIYNEHGIQQIGVEVGRSPVFLFEDHTGKPAPEDYPLF 

RTVNIADGKWHRVAISVEKKTVTMIVDCKKKTTKPLDRSERAIVDTNGITVFGTRILDEEVFEGDIQQFLITGDPKA

AYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESVTEGPTVTEETIAQTEANIVDDFQE 

YNYGTMESYQTEAPRHVSGTNEPNPVEEIFTEEYLTGEDYDSQRKNSEDTLYENKEIDGRDSDLLVDGDLGEYDFYE 

YKEYEDKPTSPPNEEFGPGVPAETDITETSINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPT 

GPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPMGLTGR 

PGPVGGPGSSGAKGESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRG 

ERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGPRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQ 

GNPGPQGLPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVKGADGV 

RGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVGQIGPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPG 

RQGPKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQ 

GPVGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQGLPGA 

AGKEGAKGDPGPQGISGKDGPAGLRGFPGERGLPGAQGAPGLKGGEGPQGPPGPVGSPGERGSAGTAGPIGLRGRPG 

PQGPPGPAGEKGAPGEKGPQGPAGRDGVQGPVGLPGPAGPAGSPGEDGDKGEIGEPGQKGSKGGKGENGPPGPPGLQ 

GPVGAPGIAGGDGEPGPRGQQGMFGQKGDEGARGFPGPPGPIGLQGLPGPPGEKGENGDVGPMGPPGPPGPRGPQGP 

NGADGPQGPPGSVGSVGGVGEKGEPGEAGNPGPPGEAGVGGPKGERGEKGEAGPPGAAGPPGAKGPPGDDGPKGNPG 

PVGFPGDPGPPGELGPAGQDGVGGDKGEDGDPGQPGPPGPSGEAGPPGPPGKRGPPGAAGAEGRQGEKGAKGEAGAE 

GPPGKTGPVGPQGPAGKPGPEGLRGIPGPVGEQGLPGAAGQDGPPGPMGPPGLPGLKGDPGSKGEKGHPGLIGLIGP 

PGEQGEKGDRGLPGTQGSPGAKGDGGIPGPAGPLGPPGPPGLPGPQGPKGNKGSTGPAGQKGDSGLPGPPGPPGPPG 

EVIQPLPILSSKKTRRHTEGMQADADDNILDYSDGMEEIFGSLNSLKQDIEHMKFPMGTQTNPARTCKDLQLSHPDF 

PDGEYWIDPNQGCSGDSFKVYCNFTSGGETCIYPDKKSEGVRISSWPKEKPGSWFSEFKRGKLLSYLDVEGNSINMV 

QMTFLKLLTASARQNFTYHCHQSAAWYDVSSGSYDKALRFLGSNDEEMSYDNNPFIKTLYDGCTSRKGYEKTVIEIN 

TPKIDQVPIVDVMISDFGDQNQKFGFEVGPVCFLG



<210> SEQ ID NO 1448 

<211> Length : 153 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1448
>Myoglobin 

GLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGILK 
KKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 



<210> SEQ ID NO 1449 

<211> Length : 154 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1449 

>MYG_HUMAN_V1 

MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGIL 
KKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG 



<210> SEQ ID NO 1450 

<211> Length : 99 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1450 

>Q8WVH6 

MKASEDLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAM 
NKALELFRKDMASNYKELGFQG 



<210> SEQ ID NO 1451 

<211> Length : 702 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1451

>Carcinoembryonic antigen-related cell adhesion molecule 5 precursor 

MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFNVAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNR 

QIIGYVIGTQQATPGPAYSGREIIYPNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNS

KPVEDKDAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQNPVSARRSDSVIL 

NVLYGPDAPTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGTFQQSTQELFIPNITVNNSGSYTCQAHNSDTGL 

NRTTVTTITVYAEPPKPFITSNNSNPVEDEDAVALTCEPEIQNTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTR 

NDVGPYECGIQNELSVDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQEL 

FISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAELPKPSISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQ 

SLPVSPRLQLSNGNRTLTLFNVTRNDARAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISPPDSSYLSGANLNLSCH 

SASNPSPQYSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGRNNSIVKSITVSASGTSPGLSAGATVGIMI 

GVLVGVALI 



<210> SEQ ID NO 1452 

<211> Length : 495 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1452 

>Alanine aminotransferase 

ASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQRP 
ITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNVF 
LSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELARALGQARDHC 
RPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPYAGQQELASF 
HSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCPPVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAEL 
AAKAKLTEQVFNEAPGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGICVVPGSGFGQREGT 
YHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS 



<210> SEQ ID NO 1453 
<211> Length : 496 
<212> Type : PRT 
<213> Organism : Homo sapiens
<400> sequence : 1453 
>ALAT_HUMAN_V1 

MASSTGDRSQAVRHGLRAKVLTLDGMNPRVRRVEYAVRGPIVQRALELEQELRQGVKKPFTEVIRANIGDAQAMGQR 
PITFLRQVLALCVNPDLLSSPNFPDDAKKRAERILQACGGHSLGAYSVSSGIQLIREDVARYIERRDGGIPADPNNV 
FLSTGASDAIVTVLKLLVAGEGHTRTGVLIPIPQYPLYSATLAELGAVQVDYYLDEERAWALDVAELHRALGQARDH 
CRPRALCVINPGNPTGQVQTRECIEAVIRFAFEERLFLLADEVYQDNVYAAGSQFHSFKKVLMEMGPPYAGQQELAS 
FHSTSKGYMGECGFRGGYVEVVNMDAAVQQQMLKLMSVRLCPPVPGQALLDLVVSPPAPTDPSFAQFQAEKQAVLAE 
LAAKAKLTEQVFNEAPGISCNPVQGAMYSFPRVQLPPRAVERAQELGLAPDMFFCLRLLEETGICVVPGSGFGQREG 
TYHFRMTILPPLEKLRLLLEKLSRFHAKFTLEYS 

<210> SEQ ID NO 1454
<211> Length : 132 
<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1454 
>Antileukoproteinase 1 precursor 

MKSSGLFPFLVLLALGTLAPWAVEGSGKSFKAGVCPPKKSAQCLRYKKPECQSDWQCPGKKRCCPDTCGIKCLDPVD 
TPNPTRRKPGKCPVTYGQCLMLNPPNFCEMDGQCKRDLKCCMGMCGKSCVSPVKA 



<210> SEQ ID NO 1455 

<211> Length : 488 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1455 

>Stromelysin-3 precursor 

MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDVHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRP 
PRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRAD 
IMIDFARYWDGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKA 
LMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGEL
FFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRF 
PVHAALVWGPEKNKIYFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADGYAYFLRGRLYWKFDPVK 
VKALEGFPRLVGPDFFGCAEPANTFL 



<210> SEQ ID NO 1456 

<211> Length : 80 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1456 

>Trefoil factor 3 precursor 

MAARALCMLGLVLALLSSSSAEEYVGLSANQCAVPAKDRVDCGYPHVTPKECNNRGCCFDSRIPGVPWCFKPLQEAE
CTF 



<210> SEQ ID NO 1457 

<211> Length : 95 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1457 

>S-100P protein 

MTELETAMGMIIDVFSRYSGSEGSTQTLTKGELKVLMEKELPGFLQSGKDKDAVDKLLKDLDANGDAQVDFSEFIVF
VAAITSACHKYFEKAGLK 



<210> SEQ ID NO 1458 

<211> Length : 302 

<212> Type : PRT 

<213> Organism : Homo sapiens 
<400> sequence : 1458
>Stanniocalcin 2 precursor 

MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGRLSLQNTAEIQHCLVNAGDVGCGVFECFENNSCE 
IRGLHGICMTFLHNAGKFDAQGKSFIKDALKCKAHALRHRFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQENTR 
VIVEMIHFKDLLLHEPYVDLVNLLLTCGEEVKEAITHSVQVQCEQNWGSLCSILSFCTSAIQKPPTAPPERQPQVDR 
TKLSRAHHGEAGHHLPEPSSRETGRGAKGERGSKSHPNAHARGRVGGLGAQGPSGSSEWEDEQSEYSDIRR 

<210> SEQ ID NO 1459 

<211> Length : 578 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1459 

>Putative alpha-mannosidase C20orf31 precursor 

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA 
EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD
IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVS 
MPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATG 
DPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYG 
ECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPEN 
HDQARERKPAKQKVPLLSCPSQPFTSKLALLGQVFLDSS 



<210> SEQ ID NO 1460 

<211> Length : 578 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1460 



>AAH16184

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERVKAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLT 
LIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNASVFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMA
EEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSD 
IGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVS 
MPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATG 
DPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDTVITPYG 
ECILGAGGYIFNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPEN 
HDQARERKPAKQKVPLLSCPSQPFTSKLALLGQVFLDSS 



<210> SEQ ID NO 1461 

<211> Length : 541 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1461 

>AAQ88943 

MPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYSFSLTLIDALDTLLILGNVSEFQRVVEVLQDSVDFDIDVNAS 
VFETNIRVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPAFQTPTGMPYGTVNLLHGVNPGETPVTC 
TAGIGTFIVEFATLSSLTGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAGIGAGVDSYFEYLVKGA 
ILLQDKKLMAMFLEYNKAIRNYTRFDDWYLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYYTVWKQ 
FGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRATGDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLD 
NRMESFFLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYIFNTEAHPIDLAALHCCQRLKEEQWEVE 
DLMREFYSLKRSRSKFQKNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPSQPFTSKLALLGQVFLD 

SS 



<210> SEQ ID NO 1462 

<211> Length : 314 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1462 

>Osteopontin precursor 

MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQNAVSSEETNDFKQETLPSK 
SNESHDHMDDMDDEDDDDHVDSQDSIDSNDSDDVDDTDDSHQSDESHHSDESDELVTDFPTDLPATEVFTPVVPTVD
TYDGRGDSVVYGLRSKSKKFRRPDIQYPDATDEDITSHMESEELNGAYKAIPVAQDLNAPSDWDSRGKDSYETSQLD 
DQSAETHSHKQSRLYKRKANDESNEHSDVIDSQELSKVSREFHSHEFHSHEDMLVVDPKSKEEDKHLKFRISHELDS 

ASSEVN 



<210> SEQ ID NO 1463 

<211> Length : 357 

<212> Type : PRT 

<213> Organism : Homo sapiens 

<400> sequence : 1463 

>NOV protein homolog precursor 

MQSVQSTSFCLRKQCLCLTFLLLHLLGQVAATQRCPPQCPGRCPATPPTCAPGVRAVLDGCSCCLVCARQRGESCSD 
LEPCDESSGLYCDRSADPSNQTGICTAVEGDNCVFDGVIYRSGEKFQPSCKFQCTCRDGQIGCVPRCQLDVLLPEPN
CPAPRKVEVPGECCEKWICGPDEEDSLGGLTLAAYRPEATLGVEVSDSSVNCIEQTTEWTACSKSCGMGFSTRVTNR 
NRQCEMLKQTRLCMVRPCEQEPEQPTDKKGKKCLRTKKSLKAIHLQFKNCTSLHTYKPRFCGVCSDGRCCTPHNTKT 
IQAEFQCSPGQIVKKPVMVIGTCTCHTNCPKNNEAFLQELELKTTRGKM 



<210> SEQ ID NO 1464 

<211> Length : 516 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1464 



>HSU33147_PEA_1_T1 

GTGCTCACCTCCACAGCGGCTTCCTTGATCCTTGCCACCCGCGACTGAACACCGACAGCAGCAGCCTCACCATGAAG 
TTGCTGATGGTCCTCATGCTGGCGGCCCTCTCCCAGCACTGCTACGCAGGCTCTGGCTGCCCCTTATTGGAGAATGT 
GATTTCCAAGACAATCAATCCACAAGTGTCTAAGACTGAATACAAAGAACTTCTTCAAGAGTTCATAGACGACAATG 
CCACTACAAATGCCATAGATGAATTGAAGGAATGTTTTCTTAACCAAACGGATGAAACTCTGAGCAATGTTGAGCAA 
TTAATATATGACAGCAGTCTTTGTGATTTATTTTAACTTTCTGCAAGACCTTTGGCTCACAGAACTGCAGGGTATGG
TGAGAAACCAACTACGGATTGCTGCAAACCACACCTTCTCTTTCTTATGTCTTTTTACTACAAACTACAAGACAATT 
GTTGAAACCTGCTATACATGTTTATTTTAATAAATTGATGGCAAAAAAAAAAAT 

<210> SEQ ID NO 1465 

<211> Length : 907 

<212> Type : DNA 

<213> Organism : Homo sapiens 

<400> sequence : 1465 
>HSU33147_PEA_1_T2

GTGCTCACCTCCACAGCGGCTTCCTTGATCCTTGCCACCCGCGACTGAACACCGACAGCAGCAGCCTCACCATGAAG 
TTGCTGATGGTCCTCATGCTGGCGGCCCTCTCCCAGCACTGCTACGCAGGCTCTGGCTGCCCCTTATTGGAGAATGT 
GATTTCCAAGACAATCAATCCACAAGTGTCTAAGACTGAATACAAAGAACTTCTTCAAGAGTTCATAGACGACAATG 
CCACTACAAATGCCATAGATGAATTGAAGGAATGTTTTCTTAACCAAACGGATGAAACTCTGAGCAATGTTGAGGTG 
TTTATGGTAATTTCATTTTCTTCCTATAAGCTTTTTAAATCCCCTGACCAGGGACAAGTGGGCTCTTCATTTCTCAC 
TGACAATGCCAAAGCCACTAGTGAACAAGCCTTTTCTTACATTGGTTAATTTAGTTGAATGGTTAGTCTAATGACTT 
TGCCATCAAGAAAAACATCCAGTGTCCCTGTGTTGTCACTCTACCCAGAGAATCCTCAGTGGATGATAAATGAATAG 
GGCAAGAGAGGAAAAGGAAAGGTCGGTAGAAGTCTTACCTATCCCCAGAGCTCTCTAATTCATGCTCACAAACACAG 
ACACAATCACACAAACACAGAAACACACATACACACATCCAGACACATGCAAACACACAGACACAGTCACAATCACA 
CAAACACACACACATTCAGACATACACAAACATAGACAGACAGGCAAAGACACAGACACAGACACAGACACAATCAC 
ACCAGCACACAATCATCCAGACACAAACACAAACACACAGACAGAACCACACAACCACAGAAACACAGAGACACACA 
CAAACACACTCAGACACACACATACAAACATATGTTCACTCTCTACAGAAAAAACAATTT 

SEQ ID NO:1466 
>HPRT1-F: 

TGACACTGGCAAAACAATGCA 

SEQ ID NO:1467 
>HPRT1-R: 

GGTCCTTTTCACCAGCAAGCT 

SEQ ID NO:1468 
>amplicon 
TGACACTGGCAAAACAATGCAGACTTTGCTTTCCTTGGTCAGGCA
GTATAATCCAAAGATGGTCAAGGTCGCAAGCTTGCTGGTGAAAAGGACC 

SEQ ID NO:1469 
>PBGD-F2 : 

TGAGAGTGATTCGCGTGGG 

SEQ ID NO:1470
>PBGD-R2:

CCAGGGTACGAGGCTTTCAAT 

SEQ ID NO:1471 
>amplicon 

TGAGAGTGATTCGCGTGGGTACCCGCAAGAGCCAGCTTGCTCGCATACAGACGGACAGTGTGGTGGCAACATTGAAA 
GCCTCGTACCCTGG 

SEQ ID NO:1472 
>Ubiquitin-F:
ATTTGGGTCGCGGTTCTTG 

SEQ ID NO:1473 
>Ubiquitin-R:
TGCCTTGACATTCTCGATGGT 

SEQ ID NO: 1474 
>Amplicon : 

ATTTGGGTCGCGGTTCTTGTTTGTGGATCGCTGTGATCGTCACTTGACAATGCAGATCTTCGTGAAGACTCTGACTG 
GTAAGACCATCACCCTCGAGGTTGAGCCCAGTGACACCATCGAGAATGTCAAGGCA

SEQ ID NO:1475 
>SDHA-F:

TGGGAACAAGAGGGCATCTG 

SEQ ID NO:1476 
>SDHA-R: 

CCACCACTGCATCAAATTCATG

SEQ ID NO:1477
>amplicon 

TGGGAACAAGAGGGCATCTGCTAAAGTTTCAGATTCCATTTCT 
GCTCAGTATCCAGTAGTGGATCATGAATTTGATGCAGTGGTGG 

SEQ ID NO:1478 
>Forward primer: 
AGACTCCAACCCACAGCCC 

SEQ ID NO:1479 
>Reverse primer: 
CAGCTCAGCCAACCTTGCA 

SEQ ID NO:1480 
>Amplicon : 

AGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGGGGGCCACCT 
GGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTG 



SEQ ID NO: 1481 

>T86235 # transcript_8 #len 1491 (Includes node 44 - TAA seg 35) 

CTGTCTCCACTAAAATTTTAAAAATTAGAAAGTGCTCTCTGGAAAAGCTGCCTAACTCTCACTGCTTCTCTGCTGCC 
CCTCCTAATGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGGGTGGGACTCAGGCTGGTCT 
TTCTCCTGCCCCAGCCTGGCTTGCTTGTGTGCCTTGTTCCTTGGTGACAGGAGCAGGTTGCCGTCCGGTTGTTTGAC 
CAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAG 
AACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGC 
TGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAG 
ATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCT 
GCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCT 
TCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCA 
GAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGT
ACCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTGAGCCTGAGATACCGGAGTCCT 
CTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGG 
CAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCT 
AGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGC 
TTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGCCTCAGCAATCTGGCC 
CCTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGA 
CGATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACAT
TACTCGAATGGCAGGATGCCCTGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAAC 
CACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCAC 

TTTTGTCCTCAATAAAGTTTCTAAAGTA

SEQ ID NO:1482

>T86235 # transcript_9 #len 1644 (Includes node 44 - TAA seg 35; node 32 - TAA seg 37)

CTGTCTCCACTAAAATTTTAAAAATTAGAAAGTGCTCTCTGGAAAAGCTGCCTAACTCTCACTGCTTCTCTGCTGCC 

CCTCCTAATGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGGGTGGGACTCAGGCTGGTCT 

TTCTCCTGCCCCAGCCTGGCTTGCTTGTGTGCCTTGTTCCTTGGTGACAGGAGCAGGTTGCCGTCCGGTTGTTTGAC 

CAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAG 

AACCCCCAGCCTCCAGGAGGTGAAGATTCAAGTGAGTCTGTGTGGCCAACAGCTTTGATGTCTATTGAACAGTGACT 

GGGCTGAGGAAGAGGGAAAAGAGATGGGGGATCAGGAATAGGACAGTGTGGGTAGACTACTGAACGCACATCTTGAT 

GTCACACTGGGGTGCTCTCTCCCACCACAGCGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCT 

GGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGA 

TTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTG 

CCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTT 

CTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAG

AACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTA 

CCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTGAGCCTGAGATACCGGAGTCCTC 

TCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGC 

AGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTA 

GAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGCT 

TTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGCCTCAGCAATCTGGCCC 

CTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGAC 

GATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATT 

ACTCGAATGGCAGGATGCCCTGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACC 

ACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACT 

TTTGTCCTCAATAAAGTTTCTAAAGTA

SEQ ID NO:1483

>T86235 # transcript_10 #len 1404 (Includes node 44 - TAA seg 35)

CTGTCTCCACTAAAATTTTAAAAATTAGAAAGTGCTCTCTGGAAAAGCTGCCTAACTCTCACTGCTTCTCTGCTGCC 

CCTCCTAATGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGGGTGGGACTCAGGCTGGTCT 

TTCTCCTGCCCCAGCCTGGCTTGCTTGTGTGCCTTGTTCCTTGGTGACAGGAGCAGGTTGCCGTCCGGTTGTTTGAC 

CAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAG 

AACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGC 
TGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAG
ATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCT 
GCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCT 
TCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCA 
GAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGA 
ACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGGAC 
CCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCAGT 
CTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCCAG 
CCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGCCTCAGCAATCTGGCCCCTCGAACCC 
TAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGT 
GCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATG 
GCAGGATGCCCTGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCTGCC 
CTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTCCTC 
AATAAAGTTTCTAAAGTA

SEQ ID NO: 1484 

>T86235 # transcript_22 #len 2797 (Includes node 37 - TAA seg 42) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 
GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 
CATCACAAAGATTGGTGGGGATCAGTCAGCCTCGGAACCCCTTGGAAGAGCTCAGGCCTAGCCCTAGGGGTCAAAAT 
GTGGGGCCTGGGCCCCCTGCCCAGACAGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCAC 
CATCCTGTCAGGTGAGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAG 
GAAGTCAGGGAGGCACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCAC 
CGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCA 
GAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAA 
GGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCT 
TTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCA 
GCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTG 
CCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCC 
CCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCAT 
GCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCC
TGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCA 
CCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTAT 
CCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTC 
TGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACT
TCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCC 
CTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGG
AACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAG 
ATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTC 
CTACTGTAGGATTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAG 
CAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGG 
GCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGC 
AACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGAC 
CCCCAGCAGGCCAGGCAGGTAAGGAGTTGGCTGGGAAGGAGTGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTG 
CACACCTGCCCTGCCCCTACCCCAGGCAATCTCATGCTTCCACACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCA 
GGAAGAGGGGAGGGGCTGCACTTCCAGCCCTGTGCTCCTAATTGGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACA
GTACATGGTGGAAGTATAGGACCCCAGACCTCCCTCTAAATTTTCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCT 
CGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGA 
TGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTAC 
TCGAATGGCAGGATGCCCTGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCAC 
TCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTT 
TGTCCTCAATAAAGTTTCTAAAGTA 

SEQ ID NO: 1485 

>T86235 # transcript_23 #len 2962 (Includes node 39 - TAA seg 44; node 37 - TAA seg 42)

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 
GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 
CATCACAAAGATTGGTGGGGATCAGTCAGCCTCGGAACCCCTTGGAAGAGCTCAGGCCTAGCCCTAGGGGTCAAAAT 
GTGGGGCCTGGGCCCCCTGCCCAGACAGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCAC 
CATCCTGTCAGGTGAGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAG 
GAAGTCAGGGAGGCACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCAC 
CGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCA 
GAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAA 
GGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCT 
TTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCA 
GCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTG 
CCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCC 
CCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCAT 
GCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCC 
TGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCA 
CCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTAT 
CCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTC

TGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACT 

TCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCC 

CTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGG 

AACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAG 

ATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTC 

CTACTGTAGGATTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAG 

CAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGG 

GCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGC 

AACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGAC 

CCCCAGCAGGCCAGGCAGGTAAGGAGTTGGCTGGGAAGGAGTGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTG 

CACACCTGCCCTGCCCCTACCCCAGGCAATCTCATGCTTCCACACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCA 

GGAAGAGGGGAGGGGCTGCACTTCCAGCCCTGTGCTCCTAATTGGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACA 

GTACATGGTGGAAGTATAGGACCCCAGACCTCCCTCTAAATTTTCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCT 

CGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGA 

TGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTAC 

TCGAATGGCAGGATGCCCTGGTGAGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGG 

GGAACAGGGACAGGGGGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGC 

TGAGCTGTGACAAGGTCTTCTCTGTCTCCAGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGAT 

GAGACAACCACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAG 

CCGCCCACTTTTGTCCTCAATAAAGTTTCTAAAGTA 

SEQ ID NO:1486 

>T86235 # transcript_24 #len 2605 (Includes node 39 - TAA seg 44)

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 

TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 

CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 

GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 

CATCACAAAGATTGGTGGGGATCAGTCAGCCTCGGAACCCCTTGGAAGAGCTCAGGCCTAGCCCTAGGGGTCAAAAT 

GTGGGGCCTGGGCCCCCTGCCCAGACAGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCAC 

CATCCTGTCAGGTGAGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAG 

GAAGTCAGGGAGGCACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCAC 

CGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCA 

GAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAA 

GGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCT 

TTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCA 

GCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTG 

CCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCC 
CCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCAT
GCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCC 
TGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCA 
CCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTAT 
CCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTC 
TGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACT 
TCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCC 
CTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGG 
AACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAG 
ATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGG 

GCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGC 


CCTGCACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGC 
CTGATCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGG 
CCAGGCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCC 
ACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGG 
GTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGATGCCCTGGTGAGACTCCAACCCACAGCCCAGCTGTG 
GCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGGGGGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAG 
GAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTGTGACAAGGTCTTCTCTGTCTCCAGTGTTTCATTCCAGTTGGT 
TCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTT
ATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTCCTCAATAAAGTTTCTAAAGTA 

SEQ ID NO:1487 

>T86235 # transcript_25 #len 2254 (Includes node 39 - TAA seg 44) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 

TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 

CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGGGTGTTCGGGC 

CTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGG 

GACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCAC 

CCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTC 

AGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAG 

ATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGG 

GACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCC 

ATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTC 

TCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGT 

GTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAG 

TTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCA 

GCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGG 

GGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAG 
AACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGC
CCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGG 
AGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAG 
GCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGAACAGCTTG 
AGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGG 
CCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACC 
CTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTA 
TCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCTG 
AGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTTA 
CACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGATG 
CCCTGGTGAGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGGG 
GGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTGTGACAAGG 
TCTTCTCTGTCTCCAGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCT 
GCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTC 
CTCAATAAAGTTTCTAAAGTA 

SEQ ID NO:1488 

>T86235 # transcript_26 #len 2611 (Includes node 39 - TAA seg 44; node 37 - 
TAA seg 42) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGGGTGTTCGGGC 
CTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGG 
GACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCAC 
CCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTC 
AGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAG 
ATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGG 
GACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCC 
ATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTC 
TCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGT 
GTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAG 
TTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCA 
GCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGG 
GGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAG 
AACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGC 
CCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGG 
AGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAG 
GCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGC 
CCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTGAGCCTGAGATACCGGAGTCCTCTCGCCAG
GAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGG 
ACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCA 
GTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCC 
AGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGTAAGGAGTTGGCTGGGAAGGAGTG 
TGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGCCCCTACCCCAGGCAATCTCATGCTTCCAC 
ACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCAGGAAGAGGGGAGGGGCTGCACTTCCAGCCCTGTGCTCCTAATT 
GGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACAGTACATGGTGGAAGTATAGGACCCCAGACCTCCCTCTAAATTT 
TCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCG 
CCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCC 
ACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGATGCCCTGGTGAGACTCCAACCCACAGCCCA 
GCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGGGGGCCACCTGGGCTTCTTCACAGAGAGGT 
CAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTGTGACAAGGTCTTCTCTGTCTCCAGTGTTTCATTCCA 
GTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCC 
TTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTCCTCAATAAAGTTTCTAAAGTA 

SEQ ID NO:1489 

>T86235 # transcript_27 #len 2446 (Includes node 37 - TAA seg 42)

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 

TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 

CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGGGTGTTCGGGC 

CTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGCCAGGGCTTCCTGCTTCTCTAGGCTGGAGG 

GACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTCTGATTTCACCTTCAGGACCTTCCTTTCAC 

CCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGCAGCAGCCGGACTTCAGTGAGCCAGGCCTC 

AGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAAAGGAGAACGCGAGGTTGTCACTCACTCAG 

ATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAAGAGAAAACCGAGAAATGTCACATACCAGG 

GACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAGCCCTTGCCTGGCCATGTGGTGCCATGTCC

ATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAACTCTGACCTCATATTCAGTGTTGCGGCGTC 

TCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCAGAGTTCAGCAGGCCCAGTGGCTGCGTGGT 

GTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAGGTTGCCGTCCGGTTGTTTGACCAGGAGAG 

TTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCCTTCTGGACCCCACTCTAACAGAACCCCCA 

GCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGTTGAGACAGGAAGTAGAGGGGCTGGTAGGG 

GGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAACTTCAGCCCCTGCTGACTGAGATTTCTAG 

AACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGGACTGTTAAAACACTCAGGGCTGCCAAAGC 

CCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAGAGCCTGGGCCCCCAGAGGCCTTCTGTAGG 

AGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTACCAGAGCCCTACCCTCCAGCAGAACCCAG 

GCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGC 

CCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTGAGCCTGAGATACCGGAGTCCTCTCGCCAG 
GAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCAGCACCCAGGGGCAGTCTGG
ACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAACATAGAAGTCTAGAGTCCA 
GTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTCCCAACACCCGCTTTGTGCC 
AGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGTAAGGAGTTGGCTGGGAAGGAGTG 
TGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGCCCCTACCCCAGGCAATCTCATGCTTCCAC 
ACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCAGGAAGAGGGGAGGGGCTGCACTTCCAGCCCTGTGCTCCTAATT 
GGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACAGTACATGGTGGAAGTATAGGACCCCAGACCTCCCTCTAAATTT 
TCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCG 
CCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCC 
ACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGATGCCCTGTGTTTCATTCCAGTTGGTTCTGC 
TGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGT 
CGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTCCTCAATAAAGTTTCTAAAGTA 

SEQ ID NO: 1490 

>T86235 # transcript_29 #len 844 (Includes node 11 - TAA seg 14; node 6 - TAA
seg 9)

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGGTAAGAGGGGC 
CTAATGGGGGAAGACAGTAGTCACACCAGTAATGCACCCCAACACTAAACCTCACCTTTTTGTCCCCGCTCCCTCCC 
CTAGAGATGGGTGCAGAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCA 
GGCACCAGGCAGAGACATCACAAAGATTGGTGGGGATCAGTCAGCCTCGGAACCCCTTGGAAGAGCTCAGGCCTAGC 
CCTAGGGGTCAAAATGTGGGGCCTGGGCCCCCTGCCCAGACAGGTACCTGTTGGAGCCATGGTAACACGGCCTCCAT 
GGCTGAGTAGGGGACTAGGAAGGGTAAAAGTGGGGTTTTGGGGTTTTGCACTCACTCCTGCTGTCTCCTACTTACTG 
TGATAGTCCTGGTCCCAGCTCCTGGAAAGCTCTTTGCTCTTAGAAGATCTCCCTTTCCTCCAGCAAAATGTCATCTC 
GCCAGGTGCCATGGCTTGTACCTGTAATCACAGCTACTCAGGAGGCTGAAGCAGGAGGATCACTGGAGGCCAGGAGT 
TGGAGACCAGCCTGGATGACAGAGGGAGATCCCATCTCTTTAAAAAATAAATAAATAAATAAATAATAAATATC 

SEQ ID NO:1491 

>T86235 # transcript_30 #len 752 (Includes node 11 - TAA seg 14) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 
GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 
CATCACAAAGATTGGTGGGGATCAGTCAGCCTCGGAACCCCTTGGAAGAGCTCAGGCCTAGCCCTAGGGGTCAAAAT 
GTGGGGCCTGGGCCCCCTGCCCAGACAGGTACCTGTTGGAGCCATGGTAACACGGCCTCCATGGCTGAGTAGGGGAC 
TAGGAAGGGTAAAAGTGGGGTTTTGGGGTTTTGCACTCACTCCTGCTGTCTCCTACTTACTGTGATAGTCCTGGTCC 
CAGCTCCTGGAAAGCTCTTTGCTCTTAGAAGATCTCCCTTTCCTCCAGCAAAATGTCATCTCGCCAGGTGCCATGGC 
TTGTACCTGTAATCACAGCTACTCAGGAGGCTGAAGCAGGAGGATCACTGGAGGCCAGGAGTTGGAGACCAGCCTGG
ATGACAGAGGGAGATCCCATCTCTTTAAAAAATAAATAAATAAATAAATAATAAATATC

SEQ ID NO:1492 

>T86235_PEA_13_P8 # trn_8, trn_9 #len 314

MVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQEQ
LEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEVPEPCPPAEPRPLESYCRIEPEIPESSRQEQLEVPEPCPPAE
PGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLRPP
AGQAGLSNLAPRTLALRERLKSCLTAIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQDALCFIPVGSA
APQGSP

SEQ ID NO:1493 

>T86235_PEA_13_P9 # trn_10 #len 285

MVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQEQ
LEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEEQLEVPEPCPPAEPGPLQPSTQGQSGPPGPCPRVELGASEPC 
TLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLRPPAGQAGLSNLAPRTLALRERLKSCLTAIHC 
FHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQDALCFIPVGSAAPQGSP

SEQ ID NO:1494 

>T86235_PEA_13_P2 # trn_22 #len 868

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE 
TSQRLVGISQPRNPLEELRPSPRGQNVGPGPPAQTEAPGTIEFVADPAALATILSGEGVKSCHLGRQPSLAKRVLVR 
GSQGGTTQRVQGVRASAYLAPRTPTHRLDPARASCFSRLEGPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELR 
RETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHSDEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSP 
APVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRRLTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPA 
LPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTPSLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSS 
LDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQ 
EQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEVPEPCPPAEPRPLESYCRIEPEIPESSRQEQLEVPEPCPP 
AEPGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLR 
PPAGQAGKELAGKECEHKRSSPHCELHTCPAPTPGNLMLPHLPPWPSLALPQEEGRGCTSSPVLLIGLAVGGGGGED 
STWWKYRTPDLPLNFPCPSGLSNLAPRTLALRERLKSCLTAIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATL 
LEWQDALCFIPVGSAAPQGSP 

SEQ ID NO:1495 

>T86235_PEA_13_P4 # trn_23 #len 901 

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE 
TSQRLVGISQPRNPLEELRPSPRGQNVGPGPPAQTEAPGTIEFVADPAALATILSGEGVKSCHLGRQPSLAKRVLVR 
GSQGGTTQRVQGVRASAYLAPRTPTHRLDPARASCFSRLEGPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELR 
RETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHSDEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSP
APVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRRLTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPA
LPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTPSLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSS
LDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQ 
EQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEVPEPCPPAEPRPLESYCRIEPEIPESSRQEQLEVPEPCPP 
AEPGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLR 
PPAGQAGKELAGKECEHKRSSPHCELHTCPAPTPGNLMLPHLPPWPSLALPQEEGRGCTSSPVLLIGLAVGGGGGED 
STWWKYRTPDLPLNFPCPSGLSNLAPRTLALRERLKSCLTAIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATL 
LEWQDALVRLQPTAQLWLHSEPDGRWGTGTGGHLGFFTERSAGRLGYSARLAEL 

SEQ ID NO:1496 

>T86235_PEA_13_P5 # trn_24 #len 782

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE 
TSQRLVGISQPRNPLEELRPSPRGQNVGPGPPAQTEAPGTIEFVADPAALATILSGEGVKSCHLGRQPSLAKRVLVR 
GSQGGTTQRVQGVRASAYLAPRTPTHRLDPARASCFSRLEGPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELR 
RETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHSDEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSP 
APVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRRLTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPA 
LPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTPSLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSS 
LDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPKPCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQ 
EQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEEQLEVPEPCPPAEPGPLQPSTQGQSGPPGPCPRVELGASE 
PCTLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLRPPAGQAGLSNLAPRTLALRERLKSCLTAI 
HCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQDALVRLQPTAQLWLHSEPDGRWGTGTGGHLGFFTERSA
GRLGYSARLAEL

SEQ ID NO:1497

>T86235_PEA_13_P18 # trn_25 #len 665

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRGVRASAYLAPRTPTHRLDPARASCFSRLE 
GPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELRRETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHS 
DEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSPAPVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRR 
LTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPALPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTP 
SLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSSLDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPK 
PCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQEQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEEQL 
EVPEPCPPAEPGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLESSLPPCCSQWAPATTSLIFSSQHPLCASPP 
ICSLQSLRPPAGQAGLSNLAPRTLALRERLKSCLTAIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQD
ALVRLQPTAQLWLHSEPDGRWGTGTGGHLGFFTERSAGRLGYSARLAEL 

SEQ ID NO:1498 

>T86235_PEA_13_P19 # trn_26 #len 784

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRGVRASAYLAPRTPTHRLDPARASCFSRLE
GPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELRRETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHS 
DEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSPAPVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRR 
LTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPALPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTP 
SLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSSLDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPK
PCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQEQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEVPE 
PCPPAEPRPLESYCRIEPEIPESSRQEQLEVPEPCPPAEPGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLES 
SLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLRPPAGQAGKELAGKECEHKRSSPHCELHTCPAPTPGNLMLP 
HLPPWPSLALPQEEGRGCTSSPVLLIGLAVGGGGGEDSTWWKYRTPDLPLNFPCPSGLSNLAPRTLALRERLKSCLT 
AIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQDALVRLQPTAQLWLHSEPDGRWGTGTGGHLGFFTER
SAGRLGYSARLAEL

SEQ ID NO:1499

>T86235_PEA_13_P20 # trn_27 #len 751

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRGVRASAYLAPRTPTHRLDPARASCFSRLE 
GPGPRGRTLCPQRLQALISPSGPSFHPSTRPSFQELRRETAGSSRTSVSQASGLLLETPVQPAFSLPKGEREVVTHS 
DEGGVASLGLAQRVPLRENREMSHTRDSHDSHLMPSPAPVAQPLPGHVVPCPSPFGRAQRVPSPGPPTLTSYSVLRR 
LTVQPKTRFTPMPSTPRVQQAQWLRGVSPQSCSEDPALPWEQVAVRLFDQESCIRSLEGSGKPPVATPSGPHSNRTP 
SLQEVKIQRIGILQQLLRQEVEGLVGGQCVPLNGGSSLDMVELQPLLTEISRTLNATEHNSGTSHLPGLLKHSGLPK
PCLPEECGEPQPCPPAEPGPPEAFCRSEPEIPEPSLQEQLEVPEPYPPAEPRPLESCCRSEPEIPESSRQEQLEVPE 
PCPPAEPRPLESYCRIEPEIPESSRQEQLEVPEPCPPAEPGPLQPSTQGQSGPPGPCPRVELGASEPCTLEHRSLES 
SLPPCCSQWAPATTSLIFSSQHPLCASPPICSLQSLRPPAGQAGKELAGKECEHKRSSPHCELHTCPAPTPGNLMLP 
HLPPWPSLALPQEEGRGCTSSPVLLIGLAVGGGGGEDSTWWKYRTPDLPLNFPCPSGLSNLAPRTLALRERLKSCLT 
AIHCFHEARLDDECAFYTSRAPPSGPTRVCTNPVATLLEWQDALCFIPVGSAAPQGSP 

SEQ ID NO:1500 

>T86235_PEA_13_P21 # trn_29 #len 52

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRVRGA

SEQ ID NO:1501

>T86235_PEA_13_P13 # trn_30 #len 126

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE 
TSQRLVGISQPRNPLEELRPSPRGQNVGPGPPAQTGTCWSHGNTASMAE 



SEQ ID NO: 1502 

>T86235 # node_6 (TAA seg 9) #len 92 

GTAAGAGGGGCCTAATGGGGGAAGACAGTAGTCACACCAGTAATGCACCCCAACACTAAA 
CCTCACCTTTTTGTCCCCGCTCCCTCCCCTAG

SEQ ID NO:1503

>T86235 # node_11 (TAA seg 14) #len 111

GTACCTGTTGGAGCCATGGTAACACGGCCTCCATGGCTGAGTAGGGGACTAGGAAGGGTAAAAGTGGGGTTTTGGGG 
TTTTGCACTCACTCCTGCTGTCTCCTACTTACTG 

SEQ ID NO:1504 

>T86235 # node_44 (TAA seg 35) #len 204

CTGTCTCCACTAAAATTTTAAAAATTAGAAAGTGCTCTCTGGAAAAGCTGCCTAACTCTCACTGCTTCTCTGCTGCC 
CCTCCTAATGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGGGTGGGACTCAGGCTGGTCT 
TTCTCCTGCCCCAGCCTGGCTTGCTTGTGTGCCTTGTTCCTTGGTGACAG 

SEQ ID NO:1505 

>T86235 # node_32 (TAA seg 37) #len 153 

GTGAGTCTGTGTGGCCAACAGCTTTGATGTCTATTGAACAGTGACTGGGCTGAGGAAGAGGGAAAAGAGATGGGGGA 
TCAGGAATAGGACAGTGTGGGTAGACTACTGAACGCACATCTTGATGTCACACTGGGGTGCTCTCTCCCACCACAG 

SEQ ID NO:1506 

>T86235 # node_37 (TAA seg 42) #len 270 

GTAAGGAGTTGGCTGGGAAGGAGTGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGCCCCT 
ACCCCAGGCAATCTCATGCTTCCACACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCAGGAAGAGGGGAGGGGCTG 
CACTTCCAGCCCTGTGCTCCTAATTGGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACAGTACATGGTGGAAGTATA 
GGACCCCAGACCTCCCTCTAAATTTTCCATGCCCCTCAG 

SEQ ID NO:1507 

>T86235 # node_39 (TAA seg 44) #len 165

GTGAGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGGGGGCCA 
CCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTGTGACAAGGTCTTC
TCTGTCTCCAG

SEQ ID NO:1508

>Unique aa coded by node_37 (TAA seg 42) [found in T86235_PEA_13_P2 # trn_22,
T86235_PEA_13_P4 # trn_23, T86235_PEA_13_P20 # trn_27, T86235_PEA_13_P19 #
trn_26]

GKELAGKECEHKRSSPHCELHTCPAPTPGNLMLPHLPPWPSLALPQEEGRGCTSSPVLLIGLAVGGGGGEDSTWWKY 
RTPDLPLNFPCPS 
SEQ ID NO:1509

>Unique aa coded by node_39 (TAA seg 44) [found in T86235_PEA_13_P4 # trn_23,
T86235_PEA_13_P5 # trn_24, T86235_PEA_13_P18 # trn_25, T86235_PEA_13_P19 #
trn_26]

VRLQPTAQLWLHSEPDGRWGTGTGGHLGFFTERSAGRLGYSARLAEL 
SEQ ID NO:1510 

>Unique aa coded by node_11 (TAA seg 14) [found in T86235_PEA_13_P13 #
trn_30]

GTCWSHGNTASMAE 
SEQ ID NO:1511 

>Unique aa coded by node_6 (TAA seg 9) [found in T86235_PEA_13_P21 # trn_29]
VRGA 

SEQ ID NO:1512 

>Oligo from seg 14 (T86235_0_0_57365)

CATGGTAACACGGCCTCCATGGCTGAGTAGGGGACTAGGAAGGGTAAAAG 
SEQ ID NO:1513 

>Oligo from seg 35 (T86235_0_0_57371)

TGTACATCTAGGGCCTCTCAGTTAGGGGCTTCAATCCATTCCTCATGAGG 
SEQ ID NO: 1514 

>Oligo from seg 42 (T86235_0_0_57378)

TGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGC 

SEQ ID NO:1515 
>forward primer: 
GCGAAACGCGATTTGTTGTT 

SEQ ID NO:1516 
>reverse primer: 
CATCTGGAGGAGGGAGGGA 

SEQ ID NO:1517 
>amplicon

GCGAAACGCGATTTGTTGTTTGTGGGTCTGATTTGTGCGTGCGGCTTGGGCTCCTGCGGCTTTTGGCTCGGCCGGGG 
GCCTTGGGCAGCGAGGCTGGAGCCGGAAGAGGTGGAGGTGAAGGGCTGCCCGCCACGTCCCTCCCTCCTCCAGATG 

SEQ ID NO:1518 

>N31842 # transcript_2 #len 1172 (Includes node 8 - TAA seg 3)

CCCTGGCTTTCGACTAGCGTCCGCTGAGCTCCAGGCTGGTGGCGCGTCACTTAGCTGGGGAAGAGGAGATAAAGGCA 
GAAAACACCACAGGAAATTGGCTGACAGCAAAGAGCGGAAGGAAGAAGAGGTGCCCCTATACTAAACACCAGACGCT 
GGAATTGGAGAAAGAATTTCTGTTCAATATGTATTTGACGCGAGAGCGCCGCCTGGAGATTAGCAAGACCATTAACC 
TTACAGACAGACAAGTCAAAATCTGGTTTCAAAATCGCAGAATGAAACTCAAGAAAATGAACCGAGAGAATCGGATC 
CGGGAACTGACCTCCAATTTTAATTTCACCTGAGAGCGCGGCCTCTCCTCCTCCCTTCCCGCTCCTTCCTCTCCCCG 
CCCCTCCTCCCTTTGTGCCTGGTGATATATTTTTTTTTCCTCCCTGAGTATAAATGCAATGCGACTGCAAAAAAGGC 
AAAGACCTCAGACTCTCCTTCCAAGGGACCTGTGGTTCGTGCTGCGAAGATGCTTCCACTTAAAGCATGAGAAATGG 
GGTGCCGGGATGTGGGGTGTGGTGTGTGCCCTCATAGATGGGGGTGGGAGTGTGGCTGGTGTGTGTGTCAAACCCTC 
ACTCACCCACGCACTCACACACAGCATTCTGTTCTCCATGCAAAGTTAAGATCGAATCCATCCGCTTGTAGGGGAAA 
AAAAGGAAAAAAATTAACCAGAGAGGGTCTGTAATCTCGCAGAGCACAGGCAGAATCGTTCCTTCCTTGCTGCATTT 
CCTCCTTAGACTAATAGACGTTTTGGAAAGTTCGGCTAGTGTTCGTGTGTTTGTCGTAGCACCCAGAGCCTCCACCA 
AACCCTCTCCATGTCTTTACCTCCCAGTCGCTCTAAGAATCTGCTTGAAGTCTCGTATTTGTACTGCTTTCTGCTTT 
TCTCCCACCCCTCCTAGCACCCCCACATCCCCCATCTAGTAACATCTCAGAAATTTCATCCAGAGGAACAAAAAAAT 
TAAAAATAGAACATAGCAAAGCAAAGACAGAATGCCCCCCCCCAAATATTGTCCTGTCCCTGTCTGGGAGTTGTGTT 
ATTTAAAGATATTCTGTATGTTGTATCTTTTGCATGTAGCTTCCTTAATGGAGAAAAAAAAACCTAATAAATTTCCA 
GAATCATAATCCTCAAA 

SEQ ID NO:1519 

>N31842 # transcript_3 #len 1489 (Includes node 9 - TAA seg 5; node 2 - TAA
seg 6)

CTCCCAGGCGCGTGCTGGCTGTGGTTTGCTTTCTCAATGCTGGTGTCCTGGGGAGCTGACGTCCCCCAGCTCAGGTC 
AGGGGCTTGCAAAAAGCCTAAAATGGCGATCTTGGGCCAGGGACTAGGGAAGGCTGGGGAGATGGGGGGAGTTCTCT 
TTACTGCGTTTTCCCAGTTGAAAATTGTTTCCTGCGAAACGCGATTTGTTGTTTGTGGGTCTGATTTGTGCGTGCGG 
CTTGGGCTCCTGCGGCTTTTGGCTCGGCCGGGGGCCTTGGGCAGCGAGGCTGGAGCCGGAAGAGGTGGAGGTGAAGG 
GCTGCCCGCCACGTCCCTCCCTCCTCCAGATGCCTGGCTTGGATGGCGTTGGAACAGGGCATTTGGAATGTTAGGAG 
ATAAAGGCAGAAAACACCACAGGAAATTGGCTGACAGCAAAGAGCGGAAGGAAGAAGAGGTGCCCCTATACTAAACA 
CCAGACGCTGGAATTGGAGAAAGAATTTCTGTTCAATATGTATTTGACGCGAGAGCGCCGCCTGGAGATTAGCAAGA 
CCATTAACCTTACAGACAGACAAGTCAAAATCTGGTTTCAAAATCGCAGAATGAAACTCAAGAAAATGAACCGAGAG
AATCGGATCCGGGAACTGACCTCCAATTTTAATTTCACCTGAGAGCGCGGCCTCTCCTCCTCCCTTCCCGCTCCTTC 
CTCTCCCCGCCCCTCCTCCCTTTGTGCCTGGTGATATATTTTTTTTTCCTCCCTGAGTATAAATGCAATGCGACTGC 
AAAAAAGGCAAAGACCTCAGACTCTCCTTCCAAGGGACCTGTGGTTCGTGCTGCGAAGATGCTTCCACTTAAAGCAT 
GAGAAATGGGGTGCCGGGATGTGGGGTGTGGTGTGTGCCCTCATAGATGGGGGTGGGAGTGTGGCTGGTGTGTGTGT
CAAACCCTCACTCACCCACGCACTCACACACAGCATTCTGTTCTCCATGCAAAGTTAAGATCGAATCCATCCGCTTG 
TAGGGGAAAAAAAGGAAAAAAATTAACCAGAGAGGGTCTGTAATCTCGCAGAGCACAGGCAGAATCGTTCCTTCCTT 
GCTGCATTTCCTCCTTAGACTAATAGACGTTTTGGAAAGTTCGGCTAGTGTTCGTGTGTTTGTCGTAGCACCCAGAG 
CCTCCACCAAACCCTCTCCATGTCTTTACCTCCCAGTCGCTCTAAGAATCTGCTTGAAGTCTCGTATTTGTACTGCT 
TTCTGCTTTTCTCCCACCCCTCCTAGCACCCCCACATCCCCCATCTAGTAACATCTCAGAAATTTCATCCAGAGGAA 
CAAAAAAATTAAAAATAGAACATAGCAAAGCAAAGACAGAATGCCCCCCCCCAAATATTGTCCTGTCCCTGTCTGGG 
AGTTGTGTTATTTAAAGATATTCTGTATGTTGTATCTTTTGCATGTAGCTTCCTTAATGGAGAAAAAAAAACCTAAT 
AAATTTCCAGAATCATAATCCTCAAA 

SEQ ID NO: 1520 

>N31842 # transcript_4 #len 1183 (Includes node 9 - TAA seg 5)

CTCCCAGGCGCGTGCTGGCTGTGGTTTGCTTTCTCAATGCTGGTGTCCTGGGGAGCTGACGTCCCCCAGCTCAGAGG 
AGATAAAGGCAGAAAACACCACAGGAAATTGGCTGACAGCAAAGAGCGGAAGGAAGAAGAGGTGCCCCTATACTAAA 
CACCAGACGCTGGAATTGGAGAAAGAATTTCTGTTCAATATGTATTTGACGCGAGAGCGCCGCCTGGAGATTAGCAA 
GACCATTAACCTTACAGACAGACAAGTCAAAATCTGGTTTCAAAATCGCAGAATGAAACTCAAGAAAATGAACCGAG 
AGAATCGGATCCGGGAACTGACCTCCAATTTTAATTTCACCTGAGAGCGCGGCCTCTCCTCCTCCCTTCCCGCTCCT 
TCCTCTCCCCGCCCCTCCTCCCTTTGTGCCTGGTGATATATTTTTTTTTCCTCCCTGAGTATAAATGCAATGCGACT 
GCAAAAAAGGCAAAGACCTCAGACTCTCCTTCCAAGGGACCTGTGGTTCGTGCTGCGAAGATGCTTCCACTTAAAGC 
ATGAGAAATGGGGTGCCGGGATGTGGGGTGTGGTGTGTGCCCTCATAGATGGGGGTGGGAGTGTGGCTGGTGTGTGT 
GTCAAACCCTCACTCACCCACGCACTCACACACAGCATTCTGTTCTCCATGCAAAGTTAAGATCGAATCCATCCGCT 
TGTAGGGGAAAAAAAGGAAAAAAATTAACCAGAGAGGGTCTGTAATCTCGCAGAGCACAGGCAGAATCGTTCCTTCC 
TTGCTGCATTTCCTCCTTAGACTAATAGACGTTTTGGAAAGTTCGGCTAGTGTTCGTGTGTTTGTCGTAGCACCCAG 
AGCCTCCACCAAACCCTCTCCATGTCTTTACCTCCCAGTCGCTCTAAGAATCTGCTTGAAGTCTCGTATTTGTACTG 
CTTTCTGCTTTTCTCCCACCCCTCCTAGCACCCCCACATCCCCCATCTAGTAACATCTCAGAAATTTCATCCAGAGG 
AACAAAAAAATTAAAAATAGAACATAGCAAAGCAAAGACAGAATGCCCCCCCCCAAATATTGTCCTGTCCCTGTCTG 
GGAGTTGTGTTATTTAAAGATATTCTGTATGTTGTATCTTTTGCATGTAGCTTCCTTAATGGAGAAAAAAAAACCTA 

ATAAATTTCCAGAATCATAATCCTCAAA

SEQ ID NO: 1521 

>N31842_P2 # trn_2 #len 52

MYLTRERRLEISKTINLTDRQVKIWFQNRRMKLKKMNRENRIRELTSNFNFT 

SEQ ID NO:1522 

>N31842_P4 # trn_4 #len 116 

SQARAGCGLLSQCWCPGELTSPSSEEIKAENTTGNWLTAKSGRKKRCPYTKHQTLELEKEFLFNMYLTRERRLEISK 
TINLTDRQVKIWFQNRRMKLKKMNRENRIRELTSNFNFT 
SEQ ID NO:1523

>Unique aa seq of N31842_P4

SQARAGCGLLSQCWCPGELTSPSS 

SEQ ID NO: 1524 

>N31842 # node_8 (TAA segment 3) #len 63 

CCCTGGCTTTCGACTAGCGTCCGCTGAGCTCCAGGCTGGTGGCGCGTCACTTAGCTGGGGAAG 
SEQ ID NO:1525 

>N31842 # node_9 (TAA segment 5) #len 74

CTCCCAGGCGCGTGCTGGCTGTGGTTTGCTTTCTCAATGCTGGTGTCCTGGGGAGCTGACGTCCCCCAGCTCAG 
SEQ ID NO:1526 

>N31842 # node_2 (TAA segment 6) #len 306

GTCAGGGGCTTGCAAAAAGCCTAAAATGGCGATCTTGGGCCAGGGACTAGGGAAGGCTGGGGAGATGGGGGGAGTTC 
TCTTTACTGCGTTTTCCCAGTTGAAAATTGTTTCCTGCGAAACGCGATTTGTTGTTTGTGGGTCTGATTTGTGCGTG 
CGGCTTGGGCTCCTGCGGCTTTTGGCTCGGCCGGGGGCCTTGGGCAGCGAGGCTGGAGCCGGAAGAGGTGGAGGTGA 
AGGGCTGCCCGCCACGTCCCTCCCTCCTCCAGATGCCTGGCTTGGATGGCGTTGGAACAGGGCATTTGGAATGTT 

SEQ ID NO: 1527 
>Forward primer: 
CTCGCTCCCTTGCTCACAC 

SEQ ID NO:1528 
>Reverse primer: 
AAAGGGAAAGCGGGATGTTT 

SEQ ID NO:1529 
>Amplicon 

CTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACTGACCATTTTGCAAGTGTCAGG 
ACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCTTTCCCTTT 

SEQ ID NO:1530
>Forward primer: 
ACATCCCCCTGGAACGGAT 

SEQ ID NO: 1531 
>Reverse primer: 
CAGAAATTAGCAAAGCATTGATGG

SEQ ID NO:1532 
>Amplicon 

ACATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTATGGCCAAATCTCCATCAATGC 
TTTGCTAATTTCTG 

SEQ ID NO:1533 

>T06014 # transcript_3 #len 4096 (Includes node 29 - TAA seg 1 ; node 1 - TAA 
seg 3 ) 

AGGGGAGTGGGAGGGGTGGGGGTGAGGGGAGAGGCGAGAGGTTTAGCGTGTGGAGCTGCCTGCGCTCCGCCCGGGCT 

GTCAGTCCCGGCTCCAGCCGCCGCGAGACCTTCCCGGAGACGCGCGCACACACAGCGCACCCCCTGCACACCGCACA 

CCCTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACTGACCATTTTGCAAGTGTCA 

GGACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCTTTCCCTTTCCGGGATTTAGTCTGGGAGACGACGACGAGGG 

AAGAAGAAACCAGAGGAGTCCTGTCTGGGGGTCCCATCATTATTCGGGATACCCGCCGCCAGCGGCCTGCCTTCGGT 

TACCCACATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTATGGCCAAATCTCCATC 

AATGCTTTGCTAATTTCTGGGACTTAACTCGTAGAATCTACATACAGGGCTGGAATTXATTCAAAATGCATCTGAAG 

AAATGACATTTTAAACCGTTTTAAAAAATATCTTGATAAAAAATTCTGTAAAACAGAATTTGATAGGTTTAAAAACA 

TGACAGCAGGCTCCCGCTGCGGCCGGGACCTCGCATCCCTGCAACGTGGCCGGGGCTGCATTTTTCATGAGCCTAGG 

GTGAACAGGXGCGAAGTGCGCTGGGAGCATCCGGCCAGCGGCCGAGCGCGGGGAACATGGAGAGCGAGCGCGACATG 

TACCGCCAGTTCCAGGACTGGTGCCTCAGGACTTACGGGGACTCAGGCAAGACCAAGACGGTGACCCGTAAAAAATA 

CGAACGGATCGTCCAGCTCCTCAATGGCTCCGAGTCGAGCTCCACGGACAACGCCAAATTTAAATTCTGGGTCAAAT 

CGAAGGGCTTCCAGCTGGGCCAGCCGGACGAGGTCCGCGGGGGAGGCGGCGGCGCCAAGCAAGTGCTCTACGTGCCT 

GTCAAGACCACGGATGGCGTAGGGGTAGATGAGAAGCTATCTTTACGACGGGTAGCTGTGGTTGAAGATTTCTTTGA 

CATTATTTATTCGATGCATGTGGAAACGGGGCCAAATGGAGAACAAATTCGGAAACACGCTGGACAAAAGAGAACTT 

ACAAAGCAATTTCAGAGAGCTATGCCTTCCTACCAAGAGAAGCGGTGACACGATTTCTAATGAGCTGCTCAGAGTGC 

CAGAAAAGAATGCATTTAAACCCAGATGGAACAGATCATAAAGATAATGGAAAACCTCCCACTTTGGTGACCAGCAT 

GATTGACTACAACATGCCAATTACCATGGCCTACATGAAACACATGAAGCTGCAGCTGCTAAACTCACAGCAAGATG 

AGGATGAAAGTTCAATAGAAAGTGATGAATTTGACATGAGTGATTCAACACGGATGTCAGCTGTGAACTCTGATCTT 

AGCTCCAATCTTGAAGAAAGAATGCAAAGTCCCCAGAATCTTCATGGCCAGCAAGATGATGATTCTGCTGCAGAGAG 

CTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCA 

ACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTA 

ACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAA 

GATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATG 

AGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGAC 

GATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATGATGAGTCTGCTCCAGC 

TGACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGT 

ACATCAATGGAAATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCT 
TCCAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTC

CAGACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTT 

TATTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCACTGAGG 

TCTAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTA 

AGAGAAATACTTCCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACC 

TCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGC

ATAAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGC 

TGAAATATTAGTGGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAA

AAAAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTT 

TTATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAA 

ACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAA 

ATAACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCT 

GCATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAGAAT 

TTTTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATT 

GCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAA 

TCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCT 

CCTGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTA 

CAATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTGATG 

TTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAAC 

TTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATT 

TATGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTTCAA 

GTAATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAG 

TGTTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAAT 

ATAATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGGATT 

TAGAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAA 

CAGATTTCTGTGTTA 

SEQ ID NO: 1534 

>T06014 # transcript_5 #len 3081 (Includes node 34 - TAA seg 24)

CTCTTATTTTCCAAAGCTCCAGGTGCTGTTTTCCTTGGGACTCATATACAGAGTCTTTGGTTCAAATGCTATGGTAC 
CCAGGCTGAAACCTTGAGAGAAGAACCACCGCCTTGGCCCAGTGAACCTGAAGGCCATACCCACAGACCTAAATACT 
GCACCTTCTGTGTAGCAGCTACTCAGGATGGACCAAACTATGTTGGATGATTCTGCTGCAGAGAGCTTTAATGGCAA 
TGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCAACAGTGATGGCA 
AAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTAACTTCGGAATAC 
AGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAAGATGGAACGAGA 
GGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATGAGAGTGTAGACC 
GAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGACGATTCGGAGAAA 
GTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATATGTTTGTCAGGCTGTTTGTAGATGAAAA 



CTTGGACCGAATGGTCCCAATCTCTAAGCAGCCCAAAGAAAAGATCCAGGCTATCATTGACTCATGCAGGCGACAAT 

TCCCTGAGTATCAAGAGCGTGCCAGAAAACGTATACGTACTTACCTCAAGTCCTGCAGGCGGATGAAAAGAAGTGGT 

TTTGAGATGTCTCGACCTATTCCTTCCCACCTTACTTCAGCAGTTGCAGAGAGTATCTTGGCTTCAGCTTGTGAGAG 

TGAGAGTAGAAATGCCGCCAAGAGGATGCGTCTGGAGAGACAGCAGGATGAGTCTGCTCCAGCTGACAAACAGTGTA 

AACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGTACATCAATGGAAAT 

GGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCTTCCAGCAGTGGACC 

CACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTCCAGACCCCAGCTGA 

GTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTTTATTGCGATCTGCA 

GATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCACTGAGGTCTAAATTTGCAGT 

TTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTAAGAGAAATACTTCC 

ATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACCTCTGAAGTACTTGA 

GCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGCATAAAAGGCTATCT 

ATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGCTGAAATATTAGTGG 

AATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAAAAAAAGCCAAATTT 

CTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTTTTATAAAAAGAAAA 

CTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAAACTGTTTTGAGAAA 

CTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAAATAACACTTGATGC 

TGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCTGCATTTCTGCTCTC 

ATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAGAATTTTTTTTTTAATAA 

TGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATTGCCGCTGCTGTCTC 

CAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAATCTGAAAATTTGCA 

CTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCTCCTGATATGATTTG 

TATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTACAATGAACTGTGCC 

CTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTGATGTTTTCTTTGTGCAC 

TGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAACTTTTTTCTGTTGTT 

TAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATTTATGTCATTCAGTT 

TTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTTCAAGTAATGACTGTGAT 

TGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAGTGTTTCTAGAATTA 

CTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAATATAATGTGAAAAAT 

CTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGGATTTAGAGTTTGCTTAT 

TTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAACAGATTTCTGTGTT 

A 

SEQ ID NO: 1535 

>T06014 # transcript_6 #len 3021 (Includes node 33 - TAA seg 22) 

GAGAATTTAGGGGAAGTCAGACTAATCCCTGGGTGCACTTCGAGAGAAAGGTTGGACCTCGCCTGAGGGATGCTGGA 
TCTGCAATGATAGAATTGGCCTGCCACAGACATTGGTATTTAAGTTGAAGAGCAGAATTCACAATGATTCTGCTGCA 
GAGAGCTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGA 
CTCCAACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGC

AGCTAACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGAC 

CTCAAGATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAA 

AAATGAGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACC 

ATGACGATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATATGTTTGTCAGG 

CTGTTTGTAGATGAAAACTTGGACCGAATGGTCCCAATCTCTAAGCAGCCCAAAGAAAAGATCCAGGCTATCATTGA 

CTCATGCAGGCGACAATTCCCTGAGTATCAAGAGCGTGCCAGAAAACGTATACGTACTTACCTCAAGTCCTGCAGGC 

GGATGAAAAGAAGTGGTTTTGAGATGTCTCGACCTATTCCTTCCCACCTTACTTCAGCAGTTGCAGAGAGTATCTTG 

GCTTCAGCTTGTGAGAGTGAGAGTAGAAATGCCGCCAAGAGGATGCGTCTGGAGAGACAGCAGGATGAGTCTGCTCC 

AGCTGACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGC 

TGTACATCAATGGAAATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGAT 

GCTTCCAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAA 

CTCCAGACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCAT 

TTTTATTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCACTG 

AGGTCTAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTT 

TTAAGAGAAATACTTCCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTG 

ACCTCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGG 

GGCATAAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCC 

AGCTGAAATATTAGTGGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAA 

AAAAAAAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAA 

TTTTTATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAG 

AAAACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACA 

AAAATAACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAG 

CCTGCATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAG 

AATTTTTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTG 

ATTGCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGA 

CAATCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACA 

TCTCCTGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAAC 

TTACAATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTG 

ATGTTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATAC 

AACTTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAA 

ATTTATGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTT 

CAAGTAATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTC 

TAGTGTTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTG 

AATATAATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGG 

ATTTAGAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAAT 

AAACAGATTTCTGTGTTA 
SEQ ID NO:1536

>T06014 # transcript_7 #len 2889 (Includes node 34 - TAA seg 24)

CTCTTATTTTCCAAAGCTCCAGGTGCTGTTTTCCTTGGGACTCATATACAGAGTCTTTGGTTCAAATGCTATGGTAC 

CCAGGCTGAAACCTTGAGAGAAGAACCACCGCCTTGGCCCAGTGAACCTGAAGGCCATACCCACAGACCTAAATACT 

GCACCTTCTGTGTAGCAGCTACTCAGGATGGACCAAACTATGTTGGATGATTCTGCTGCAGAGAGCTTTAATGGCAA 

TGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCAACAGTGATGGCA 

AAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTAACTTCGGAATAC 

AGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAAGATGGAACGAGA 

GGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATGAGAGTGTAGACC 

GAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGACGATTCGGAGAAA 

GTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATTCTCGACCTATTCCTTCCCACCTTACTTC 

AGCAGTTGCAGAGAGTATCTTGGCTTCAGCTTGTGAGAGTGAGAGTAGAAATGCCGCCAAGAGGATGCGTCTGGAGA 

GACAGCAGGATGAGTCTGCTCCAGCTGACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCT 

GTTCCAGGCTCACAGGACGTGCTGTACATCAATGGAAATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGG 

GGGTCTGCTAAATCTGAATGATGCTTCCAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCT 

CAGGATCCTCCAGCAGCTCAAACTCCAGACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCA 

GGATATCGAGAATCAGCTGCATTTTTATTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGA 

CAGACGACCACCATATTCACTGAGGTCTAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACT 

TCTGGTCTTACTGCACAGTCTTTTAAGAGAAATACTTCCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTT 

AAGGTGCTTCAAAGGAACTCTGACCTCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTA 

GTGTGTCACCATAAATATCAGGGGCATAAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGT 

ATAAAACAATAGTTCAAGATCCAGCTGAAATATTAGTGGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTAC 

CTGGCAAAAAAAAAAAAAAGAAAAAAAAAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCT 

CGTATTTTAAAAAAATATGTAATTTTTATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTA 

ACTTTTTCATAGGTCCTAAAAGAAAACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTC 

CTCTTTCCAAATTCTTTCTACAAAAATAACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACAT 

CAGGAACTACTTCCAAGAGAAGCCTGCATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTT 

ATCATAGCAGTTTTCATCCCAGAATTTTTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGT 

TTCTTAGAGCTGCTCGTTCCTGATTGCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTG 

AATGTTAAGAGGTTGCAAGTGACAATCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAAT 

AGTTTCCAAAAAGACCATTACATCTCCTGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAAC 

TCTCAGCGGATGCAGCTGCAACTTACAATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTA 

TAGTGTGGGATACACATGAGTGATGTTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTA 

CTTTAGTTCATGCTTCTTATACAACTTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTG 

AATTTGCCTTATTTCATGTAAAATTTATGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTAT 

CTACTTGTTACATATAGTTTTTCAAGTAATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGG 

ACATTGCTATGCTTTTTTCTTCTAGTGTTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAA 

AACTTTTGTGATACTGTTGGTGAATATAATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGC 
ACAAACATCTTTTAAGATGTGGATTTAGAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTT
TCTCTGTTTTGTTTGGTTTAATAAACAGATTTCTGTGTTA 

SEQ ID NO:1537 

>T06014 # transcript_11 #len 4406 (Includes node 29 - TAA seg 1 ; node 1 -
TAA seg 3) 

AGGGGAGTGGGAGGGGTGGGGGTGAGGGGAGAGGCGAGAGGTTTAGCGTGTGGAGCTGCCTGCGCTCCGCCCGGGCT 

GTCAGTCCCGGCTCCAGCCGCCGCGAGACCTTCCCGGAGACGCGCGCACACACAGCGCACCCCCTGCACACCGCACA 

CCCTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACTGACCATTTTGCAAGTGTCA 

GGACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCTTTCCCTTTCCGGGATTTAGTCTGGGAGACGACGACGAGGG 

AAGAAGAAACCAGAGGAGTCCTGTCTGGGGGTCCCATCATTATTCGGGATACCCGCCGCCAGCGGCCTGCCTTCGGT 

TACCCACATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTATGGCCAAATCTCCATC 

AATGCTTTGCTAATTTCTGGGACTTAACTCGTAGAATCTACATACAGGGCTGGAATTTATTCAAAATGCATCTGAAG 

AAATGACATTTTAAACCGTTTTAAAAAATATCTTGATAAAAAATTCTGTAAAACAGAATTTGATAGGTTTAAAAACA 

TGACAGCAGGCTCCCGCTGCGGCCGGGACCTCGCATCCCTGCAACGTGGCCGGGGCTGCATTTTTCATGAGCCTAGG 

GTGAACAGGTGCGAAGTGCGCTGGGAGCATCCGGCCAGCGGCCGAGCGCGGGGAACATGGAGAGCGAGCGCGACATG 

TACCGCCAGTTCCAGGACTGGTGCCTCAGGACTTACGGGGACTCAGGCAAGACCAAGACGGTGACCCGTAAAAAATA 

CGAACGGATCGTCCAGCTCCTCAATGGCTCCGAGTCGAGCTCCACGGACAACGCCAAATTTAAATTCTGGGTCAAAT 

CGAAGGGCTTCCAGCTGGGCCAGCCGGACGAGGTCCGCGGGGGAGGCGGCGGCGCCAAGCAAGTGCTCTACGTGCCT 

GTCAAGACCACGGATGGCGTAGGGGTAGATGAGAAGCTATCTTTACGACGGGTAGCTGTGGTTGAAGATTTCTTTGA 

CATTATTTATTCGATGCATGTGGAAACGGGGCCAAATGGAGAACAAATTCGGAAACACGCTGGACAAAAGAGAACTT 

ACAAAGCAATTTCAGAGAGCTATGCCTTCCTACCAAGAGAAGCGGTGACACGATTTCTAATGAGCTGCTCAGAGTGC 

CAGAAAAGAATGCATTTAAACCCAGATGGAACAGATCATAAAGATAATGGAAAACCTCCCACTTTGGTGACCAGCAT 

GATTGACTACAACATGCCAATTACCATGGCCTACATGAAACACATGAAGCTGCAGCTGCTAAACTCACAGCAAGATG 

AGGATGAAAGTTCAATAGAAAGTGATGAATTTGACATGAGTGATTCAACACGGATGTCAGCTGTGAACTCTGATCTT 

AGCTCCAATCTTGAAGAAAGAATGCAAAGTCCCCAGAATCTTCATGGCCAGCAAGATGATGATTCTGCTGCAGAGAG 

CTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCA

ACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTA 

ACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAA 

GATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATG 

AGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGAC 

GATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATATGTTTGTCAGGCTGTT 

TGTAGATGAAAACTTGGACCGAATGGTCCCAATCTCTAAGCAGCCCAAAGAAAAGATCCAGGCTATCATTGACTCAT 

GCAGGCGACAATTCCCTGAGTATCAAGAGCGTGCCAGAAAACGTATACGTACTTACCTCAAGTCCTGCAGGCGGATG 

AAAAGAAGTGGTTTTGAGATGTCTCGACCTATTCCTTCCCACCTTACTTCAGCAGTTGCAGAGAGTATCTTGGCTTC 

AGCTTGTGAGAGTGAGAGTAGAAATGCCGCCAAGAGGATGCGTCTGGAGAGACAGCAGGATGAGTCTGCTCCAGCTG 

ACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGTAC 

ATCAATGGAAATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCTTC 
CAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTCCA

GACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTTTA 

TTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCACTGAGGTC 

TAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTAAG 

AGAAATACTTCCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACCTC 

TGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGCAT 

AAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGCTG 

AAATATTAGTGGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAAAA 

AAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTTTT 

ATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAAAC 

TGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAAAT 

AACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCTGC 

ATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAGAATTT 

TTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATTGC 

CGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAATC 

TGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCTCC 

TGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTACA 

ATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTGATGTT 

TTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAACTT 

TTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATTTA 

TGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTTCAAGT 

AATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAGTG 

TTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAATAT 

AATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGGATTTA 

GAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAACA 

GATTTCTGTGTTAAAAA 

SEQ ID NO:1538 

>T06014 # transcript_12 #len 4214 (Includes node 29 - TAA seg 1 ; node 1 - 
TAA seg 3) 

AGGGGAGTGGGAGGGGTGGGGGTGAGGGGAGAGGCGAGAGGTTTAGCGTGTGGAGCTGCCTGCGCTCCGCCCGGGCT 
GTCAGTCCCGGCTCCAGCCGCCGCGAGACCTTCCCGGAGACGCGCGCACACACAGCGCACCCCCTGCACACCGCACA 
CCCTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACTGACCATTTTGCAAGTGTCA 
GGACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCTTTCCCTTTCCGGGATTTAGTCTGGGAGACGACGACGAGGG 
AAGAAGAAACCAGAGGAGTCCTGTCTGGGGGTCCCATCATTATTCGGGATACCCGCCGCCAGCGGCCTGCCTTCGGT 
TACCCACATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTATGGCCAAATCTCCATC 
AATGCTTTGCTAATTTCTGGGACTTAACTCGTAGAATCTACATACAGGGCTGGAATTTATTCAAAATGCATCTGAAG 
AAATGACATTTTAAACCGTTTTAAAAAATATCTTGATAAAAAATTCTGTAAAACAGAATTTGATAGGTTTAAAAACA 
TGACAGCAGGCTCCCGCTGCGGCCGGGACCTCGCATCCCTGCAACGTGGCCGGGGCTGCATTTTTCATGAGCCTAGG

GTGAACAGGTGCGAAGTGCGCTGGGAGCATCCGGCCAGCGGCCGAGCGCGGGGAACATGGAGAGCGAGCGCGACATG 

TACCGCCAGTTCCAGGACTGGTGCCTCAGGACTTACGGGGACTCAGGCAAGACCAAGACGGTGACCCGTAAAAAATA 

CGAACGGATCGTCCAGCTCCTCAATGGCTCCGAGTCGAGCTCCACGGACAACGCCAAATTTAAATTCTGGGTCAAAT 

CGAAGGGCTTCCAGCTGGGCCAGCCGGACGAGGTCCGCGGGGGAGGCGGCGGCGCCAAGCAAGTGCTCTACGTGCCT 

GTCAAGACCACGGATGGCGTAGGGGTAGATGAGAAGCTATCTTTACGACGGGTAGCTGTGGTTGAAGATTTCTTTGA 

CATTATTTATTCGATGCATGTGGAAACGGGGCCAAATGGAGAACAAATTCGGAAACACGCTGGACAAAAGAGAACTT 

ACAAAGCAATTTCAGAGAGCTATGCCTTCCTACCAAGAGAAGCGGTGACACGATTTCTAATGAGCTGCTCAGAGTGC 

CAGAAAAGAATGCATTTAAACCCAGATGGAACAGATCATAAAGATAATGGAAAACCTCCCACTTTGGTGACCAGCAT 

GATTGACTACAACATGCCAATTACCATGGCCTACATGAAACACATGAAGCTGCAGCTGCTAAACTCACAGCAAGATG 

AGGATGAAAGTTCAATAGAAAGTGATGAATTTGACATGAGTGATTCAACACGGATGTCAGCTGTGAACTCTGATCTT 

AGCTCCAATCTTGAAGAAAGAATGCAAAGTCCCCAGAATCTTCATGGCCAGCAAGATGATGATTCTGCTGCAGAGAG 

CTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCA 

ACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTA 

ACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAA 

GATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATG 

AGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGAC 

GATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATTCTCGACCTATTCCTTC 

CCACCTTACTTCAGCAGTTGCAGAGAGTATCTTGGCTTCAGCTTGTGAGAGTGAGAGTAGAAATGCCGCCAAGAGGA 

TGCGTCTGGAGAGACAGCAGGATGAGTCTGCTCCAGCTGACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTAC 

TCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGTACATCAATGGAAATGGGACCTATAGTTACCATAGTTACAG 

AGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCTTCCAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAAT 

TGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTCCAGACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGA 

CAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTTTATTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACA 

ACAGAACTGAGACAGACGACCACCATATTCACTGAGGTCTAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCC 

AACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTAAGAGAAATACTTCCATTATGCCACATTGTCCTTGATCCGT 

AAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACCTCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTG 

CTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGCATAAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAA 

GAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGCTGAAATATTAGTGGAATTTGCTACTGACTCATTGGACTGA 

AAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAAAAAAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAA 

TGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTTTTATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATT 

TTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAAACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCC 

CTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAAATAACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTT 

CAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCTGCATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAA 

CACTGCAACTTTATCATAGCAGTTTTCATCCCAGAATTTTTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAA 

TCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATTGCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGA 

TGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAATCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATT 

TCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCTCCTGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAAT 
AATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTACAATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTC
TTTCTTATTTTATAGTGTGGGATACACATGAGTGATGTTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATT 
TAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAACTTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTT 
TGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATTTATGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGT 
GATATTTCTTATCTACTTGTTACATATAGTTTTTCAAGTAATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTC 
TTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAGTGTTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAA 
AAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAATATAATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAA 
ATACATAGCTGCACAAACATCTTTTAAGATGTGGATTTAGAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGG 
CTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAACAGATTTCTGTGTTAAAAA 

SEQ ID NO:1539 

>T06014 # transcript_13 #len 2775 (Includes node 34 - TAA seg 24)

CTCTTATTTTCCAAAGCTCCAGGTGCTGTTTTCCTTGGGACTCATATACAGAGTCTTTGGTTCAAATGCTATGGTAC 

CCAGGCTGAAACCTTGAGAGAAGAACCACCGCCTTGGCCCAGTGAACCTGAAGGCCATACCCACAGACCTAAATACT 

GCACCTTCTGTGTAGCAGCTACTCAGGATGGACCAAACTATGTTGGATGATTCTGCTGCAGAGAGCTTTAATGGCAA 

TGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGACTCCAACAGTGATGGCA 

AAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGCAGCTAACTTCGGAATAC 

AGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGACCTCAAGATGGAACGAGA 

GGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAAAAATGAGAGTGTAGACC 

GAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACCATGACGATTCGGAGAAA 

GTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATGATGAGTCTGCTCCAGCTGACAAACAGTG 

TAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGTACATCAATGGAA 

ATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCTTCCAGCAGTGGA 

CCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTCCAGACCCCAGCT 

GAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTTTATTGCGATCTG 

CAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCACTGAGGTCTAAATTTGCA 

GTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTAAGAGAAATACTT 

CCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACCTCTGAAGTACTT 

GAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGCATAAAAGGCTAT 

CTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGCTGAAATATTAGT 

GGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAAAAAAAGCCAAAT 

TTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTTTTATAAAAAGAA 

AACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAAACTGTTTTGAGA 

AACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAAATAACACTTGAT 

GCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCTGCATTTCTGCTC 

TCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAGAATTTTTTTTTTAAT 

AATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATTGCCGCTGCTGTC 

TCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAATCTGAAAATTTG 
CACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCTCCTGATATGATT
TGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTACAATGAACTGTG 
CCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTGATGTTTTCTTTGTGC 
ACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAACTTTTTTCTGTTG 
TTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATTTATGTCATTCAG 
TTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTTCAAGTAATGACTGTG 
ATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAGTGTTTCTAGAAT 
TACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAATATAATGTGAAAA 
ATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGGATTTAGAGTTTGCTT 
ATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAACAGATTTCTGTG 

TTA 

SEQ ID NO:1540 

>T06014 # transcript_14 #len 2829 (Includes node 33 - TAA seg 22)

GAGAATTTAGGGGAAGTCAGACTAATCCCTGGGTGCACTTCGAGAGAAAGGTTGGACCTCGCCTGAGGGATGCTGGA 

TCTGCAATGATAGAATTGGCCTGCCACAGACATTGGTATTTAAGTTGAAGAGCAGAATTCACAATGATTCTGCTGCA 

GAGAGCTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGA 

CTCCAACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGC 

AGCTAACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGAC 

CTCAAGATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAA 

AAATGAGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACC 

ATGACGATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATTCTCGACCTATT 

CCTTCCCACCTTACTTCAGCAGTTGCAGAGAGTATCTTGGCTTCAGCTTGTGAGAGTGAGAGTAGAAATGCCGCCAA 

GAGGATGCGTCTGGAGAGACAGCAGGATGAGTCTGCTCCAGCTGACAAACAGTGTAAACCAGAGGCGACCCAGGCCA 

CTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGTGCTGTACATCAATGGAAATGGGACCTATAGTTACCATAGT 

TACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATGATGCTTCCAGCAGTGGACCCACTGATCTCAGCATGAAGAG 

ACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCAAACTCCAGACCCCAGCTGAGTCCAACTGAAATCAATGCCG 

TGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGCATTTTTATTGCGATCTGCAGATGAACTGGAAAATCTCATT 

TTACAACAGAACTGAGACAGACGACCACCATATTCACTGAGGTCTAAATTTGCAGTTTCCACTAATGACATTTTGAT 

TTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTCTTTTAAGAGAAATACTTCCATTATGCCACATTGTCCTTGA 

TCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTCTGACCTCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCC 

TATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCAGGGGCATAAAAGGCTATCTATTCTTAATTCAAGGATAAAA 

CAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGATCCAGCTGAAATATTAGTGGAATTTGCTACTGACTCATTGG 

ACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAGAAAAAAAAAAGCCAAATTTCTTGTTGCTACAGGATATAAC

AACAATGAAAAGGATCTCGTATTTTAAAAAAATATGTAATTTTTATAAAAAGAAAACTTGTTTTTCATTCAAACTTG 

TCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAAAGAAAACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCA 

CATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTACAAAAATAACACTTGATGCTGGAAAAACCCTTGCCTACGT 

TCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGAAGCCTGCATTTCTGCTCTCATGCTGATCTCAAAAACCCCA 
CTCAACACTGCAACTTTATCATAGCAGTTTTCATCCCAGAATTTTTTTTTTAATAATGACAAGACATGTTGTTGAAA
AAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCCTGATTGCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCAC 
CTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGTGACAATCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTT 
TCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTACATCTCCTGATATGATTTGTATAATTTTCAGTTCTAGCTA 
AAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCAACTTACAATGAACTGTGCCCTCCTATCCCCCATACTTTAC 
CCTTCTTTCTTATTTTATAGTGTGGGATACACATGAGTGATGTTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAA 
ATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTATACAACTTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAA 
TTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTAAAATTTATGTCATTCAGTTTTTGACAATGAGTTTGAGGCA 
TCAGTGATATTTCTTATCTACTTGTTACATATAGTTTTTCAAGTAATGACTGTGATTGTGACCGAGTAATGTGCACT 
TTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCTTCTAGTGTTTCTAGAATTACTGTTCCTTACAATTATGTAA 
ACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGGTGAATATAATGTGAAAAATCTTATTGAAATATGAGTATTT 
TGGAAATACATAGCTGCACAAACATCTTTTAAGATGTGGATTTAGAGTTTGCTTATTTAAATGAAAATTCAAAAATT 
GAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTAATAAACAGATTTCTGTGTTA 

SEQ ID NO:1541 

>T06014 # transcript_15 #len 2715 (Includes node 33 - TAA seg 22) 

GAGAATTTAGGGGAAGTCAGACTAATCCCTGGGTGCACTTCGAGAGAAAGGTTGGACCTCGCCTGAGGGATGCTGGA 

TCTGCAATGATAGAATTGGCCTGCCACAGACATTGGTATTTAAGTTGAAGAGCAGAATTCACAATGATTCTGCTGCA 

GAGAGCTTTAATGGCAATGAGACTCTGGGGCACAGTTCAATTGCTTCAGGGGGAACACACAGCAGGGAGATGGGAGA 

CTCCAACAGTGATGGCAAAACTGGGCTGGAGCAAGATGAACAGCCACTGAACCTGAGTGACAGTCCCCTCTCTGCGC 

AGCTAACTTCGGAATACAGAATAGATGATCACAACAGTAATGGGAAAAACAAGTATAAGAATCTTCTAATTTCTGAC 

CTCAAGATGGAACGAGAGGCGAGAGAAAATGGAAGCAAGTCTCCTGCACATAGTTACTCCAGCTATGACTCTGGCAA 

AAATGAGAGTGTAGACCGAGGAGCTGAGGACCTCTCACTAAACAGGGGAGATGAGGACGAAGATGACCACGAGGACC 

ATGACGATTCGGAGAAAGTTAATGAGACAGACGGCGTTGAAGCCGAGCGGCTGAAAGCTTTTAATGATGAGTCTGCT 

CCAGCTGACAAACAGTGTAAACCAGAGGCGACCCAGGCCACTTACTCAACATCAGCTGTTCCAGGCTCACAGGACGT 

GCTGTACATCAATGGAAATGGGACCTATAGTTACCATAGTTACAGAGGGCTAGGAGGGGGTCTGCTAAATCTGAATG 

ATGCTTCCAGCAGTGGACCCACTGATCTCAGCATGAAGAGACAATTGGCGACTAGCTCAGGATCCTCCAGCAGCTCA 

AACTCCAGACCCCAGCTGAGTCCAACTGAAATCAATGCCGTGAGACAGCTTGTTGCAGGATATCGAGAATCAGCTGC 

ATTTTTATTGCGATCTGCAGATGAACTGGAAAATCTCATTTTACAACAGAACTGAGACAGACGACCACCATATTCAC 

TGAGGTCTAAATTTGCAGTTTCCACTAATGACATTTTGATTTCCCAACAGAGATACTTCTGGTCTTACTGCACAGTC 

TTTTAAGAGAAATACTTCCATTATGCCACATTGTCCTTGATCCGTAAGTGATGTGTTAAGGTGCTTCAAAGGAACTC 

TGACCTCTGAAGTACTTGAGCTACTTTAGTATGTCCAGCCTATTGCTTTTTGTTTTAGTGTGTCACCATAAATATCA 

GGGGCATAAAAGGCTATCTATTCTTAATTCAAGGATAAAACAGAAGAAGCTTGTGGTATAAAACAATAGTTCAAGAT 

CCAGCTGAAATATTAGTGGAATTTGCTACTGACTCATTGGACTGAAAGCTGAAGTACCTGGCAAAAAAAAAAAAAAG 

AAAAAAAAAAGCCAAATTTCTTGTTGCTACAGGATATAACAACAATGAAAAGGATCTCGTATTTTAAAAAAATATGT 

AATTTTTATAAAAAGAAAACTTGTTTTTCATTCAAACTTGTCATTTTTACTTTGGTAACTTTTTCATAGGTCCTAAA 

AGAAAACTGTTTTGAGAAACTACTGTAAGTACCTTTTCCACATCCCTTTGCCTTCTCCTCTTTCCAAATTCTTTCTA 

CAAAAATAACACTTGATGCTGGAAAAACCCTTGCCTACGTTCTTTCAATCGTCACATCAGGAACTACTTCCAAGAGA 
AGCCTGCATTTCTGCTCTCATGCTGATCTCAAAAACCCCACTCAACACTGCAACTTTATCATAGCAGTTTTCATCCC
AGAATTTTTTTTTTAATAATGACAAGACATGTTGTTGAAAAAAAATCACACCTTGGTTTCTTAGAGCTGCTCGTTCC 
TGATTGCCGCTGCTGTCTCCAGGCATCCCTCTAGCAGCACCTGGATGTAGATGACTGAATGTTAAGAGGTTGCAAGT 
GACAATCTGAAAATTTGCACTCTTGTGTGTAGTTTTCTTTTCATTTCTTTCAGAAATAGTTTCCAAAAAGACCATTA 
CATCTCCTGATATGATTTGTATAATTTTCAGTTCTAGCTAAAAATAATGTAAGGAACTCTCAGCGGATGCAGCTGCA 
ACTTACAATGAACTGTGCCCTCCTATCCCCCATACTTTACCCTTCTTTCTTATTTTATAGTGTGGGATACACATGAG 
TGATGTTTTCTTTGTGCACTGAGACAAGCCTATTTTTTAAATATTTAGGGAGAAGTACTTTAGTTCATGCTTCTTAT 
ACAACTTTTTTCTGTTGTTTAGCTTTGGTTGGATTACAAATTCTTTGTGCATTCCTGAATTTGCCTTATTTCATGTA 
AAATTTATGTCATTCAGTTTTTGACAATGAGTTTGAGGCATCAGTGATATTTCTTATCTACTTGTTACATATAGTTT 
TTCAAGTAATGACTGTGATTGTGACCGAGTAATGTGCACTTTTTCTTGTAACTGTGGACATTGCTATGCTTTTTTCT 
TCTAGTGTTTCTAGAATTACTGTTCCTTACAATTATGTAAACAAAAAACAAAAAAAAAACTTTTGTGATACTGTTGG 
TGAATATAATGTGAAAAATCTTATTGAAATATGAGTATTTTGGAAATACATAGCTGCACAAACATCTTTTAAGATGT 
GGATTTAGAGTTTGCTTATTTAAATGAAAATTCAAAAATTGAGGGCTGGTATAATTTTCTCTGTTTTGTTTGGTTTA

ATAAACAGATTTCTGTGTTA 
SEQ ID NO:1542 

>T06014_PEA_2_P2 # trn_3 #len 536

MESERDMYRQFQDWCLRTYGDSGKTKTVTRKKYERIVQLLNGSESSSTDNAKFKFWVKSKGFQLGQPDEVRGGGGGA 
KQVLYVPVKTTDGVGVDEKLSLRRVAVVEDFFDIIYSMHVETGPNGEQIRKHAGQKRTYKAISESYAFLPREAVTRF 
LMSCSECQKRMHLNPDGTDHKDNGKPPTLVTSMIDYNMPITMAYMKHMKLQLLNSQQDEDESSIESDEFDMSDSTRM 
SAVNSDLSSNLEERMQSPQNLHGQQDDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSD 
SPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDE 
DDHEDHDDSEKVNETDGVEAERLKAFNDESAPADKQCKPEATQATYSTSAVPGSQDVLYINGNGTYSYHSYRGLGGG 
LLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRESAAFLLRSADELENLILQQN 

SEQ ID NO:1543 

>T06014_PEA_2_P4 # trn_5 #len 387

MDQTMLDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNG 
KNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEA 

ERLKAFNMFVRLFVDENLDRMVPISKQPKEKIQAIIDSCRRQFPEYQERARKRIRTYLKSCRRMKRSGFEMSRPIPS
HLTSAVAESILASACESESRNAAKRMRLERQQDESAPADKQCKPEATQATYSTSAVPGSQDVLYINGNGTYSYHSYR 
GLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRESAAFLLRSADELENLILQ 

QN 

SEQ ID NO:1544 

>Unique region of T06014_PEA_2_P4 and P6 (coded by Node 34 - TAA seg 24)
MDQTML 
SEQ ID NO:1545

>T06014_PEA_2_P5 # trn_6 #len 353

MGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYD 
SGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEAERLKAFNMFVRLFVDENLDRMVPISKQPKEKIQA
IIDSCRRQFPEYQERARKRIRTYLKSCRRMKRSGFEMSRPIPSHLTSAVAESILASACESESRNAAKRMRLERQQDE 
SAPADKQCKPEATQATYSTSAVPGSQDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSS 
SSNSRPQLSPTEINAVRQLVAGYRESAAFLLRSADELENLILQQN 

SEQ ID NO: 1546 

>T06014_PEA_2_P6 # trn_7 #len 323

MDQTMLDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNG 
KNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEA 
ERLKAFNSRPIPSHLTSAVAESILASACESESRNAAKRMRLERQQDESAPADKQCKPEATQATYSTSAVPGSQDVLY 
INGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRESAAFL 

LRSADELENLILQQN 
SEQ ID NO:1547 

>T06014_PEA_2_P1 # trn_11 #len 638

MESERDMYRQFQDWCLRTYGDSGKTKTVTRKKYERIVQLLNGSESSSTDNAKFKFWVKSKGFQLGQPDEVRGGGGGA 
KQVLYVPVKTTDGVGVDEKLSLRRVAVVEDFFDIIYSMHVETGPNGEQIRKHAGQKRTYKAISESYAFLPREAVTRF
LMSCSECQKRMHLNPDGTDHKDNGKPPTLVTSMIDYNMPITMAYMKHMKLQLLNSQQDEDESSIESDEFDMSDSTRM 
SAVNSDLSSNLEERMQSPQNLHGQQDDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSD
SPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDE 
DDHEDHDDSEKVNETDGVEAERLKAFNMFVRLFVDENLDRMVPISKQPKEKIQAIIDSCRRQFPEYQERARKRIRTY 
LKSCRRMKRSGFEMSRPIPSHLTSAVAESILASACESESRNAAKRMRLERQQDESAPADKQCKPEATQATYSTSAVP 
GSQDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGY 

RESAAFLLRSADELENLILQQN 
SEQ ID NO:1548 

>T06014_PEA_2_P7 # trn_12 #len 574

MESERDMYRQFQDWCLRTYGDSGKTKTVTRKKYERIVQLLNGSESSSTDNAKFKFWVKSKGFQLGQPDEVRGGGGGA 
KQVLYVPVKTTDGVGVDEKLSLRRVAVVEDFFDIIYSMHVETGPNGEQIRKHAGQKRTYKAISESYAFLPREAVTRF 
LMSCSECQKRMHLNPDGTDHKDNGKPPTLVTSMIDYNMPITMAYMKHMKLQLLNSQQDEDESSIESDEFDMSDSTRM 
SAVNSDLSSNLEERMQSPQNLHGQQDDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSD 
SPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDE 
DDHEDHDDSEKVNETDGVEAERLKAFNSRPIPSHLTSAVAESILASACESESRNAAKRMRLERQQDESAPADKQCKP
EATQATYSTSAVPGSQDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSP 
TEINAVRQLVAGYRESAAFLLRSADELENLILQQN 
SEQ ID NO:1549

>T06014_PEA_2_P8 # trn_13 #len 285

MDQTMLDDSAAESFNGNETLGHSSIASGGTHSREMGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNG 
KNKYKNLLISDLKMEREARENGSKSPAHSYSSYDSGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEA 
ERLKAFNDESAPADKQCKPEATQATYSTSAVPGSQDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQ 
LATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRESAAFLLRSADELENLILQQN 

SEQ ID NO: 1550 

>T06014_PEA_2_P9 # trn_14 #len 289

MGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYD 
SGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEAERLKAFNSRPIPSHLTSAVAESILASACESESRN 
AAKRMRLERQQDESAPADKQCKPEATQATYSTSAVPGSQDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLS 
MKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRESAAFLLRSADELENLILQQN 

SEQ ID NO: 1551 

>T06014_PEA_2_P10 # trn_15 #len 251

MGDSNSDGKTGLEQDEQPLNLSDSPLSAQLTSEYRIDDHNSNGKNKYKNLLISDLKMEREARENGSKSPAHSYSSYD 
SGKNESVDRGAEDLSLNRGDEDEDDHEDHDDSEKVNETDGVEAERLKAFNDESAPADKQCKPEATQATYSTSAVPGS 
QDVLYINGNGTYSYHSYRGLGGGLLNLNDASSSGPTDLSMKRQLATSSGSSSSSNSRPQLSPTEINAVRQLVAGYRE 

SAAFLLRSADELENLILQQN 
SEQ ID NO: 1552 

>T06014 # node_29 (TAA seg 1) #len 314 

AGGGGAGTGGGAGGGGTGGGGGTGAGGGGAGAGGCGAGAGGTTTAGCGTGTGGAGCTGCCTGCGCTCCGCCCGGGCT 
GTCAGTCCCGGCTCCAGCCGCCGCGAGACCTTCCCGGAGACGCGCGCACACACAGCGCACCCCCTGCACACCGCACA 
CCCTCGCTCCCTTGCTCACACACACGCACACACTCAGCCTGGCCGAGCAGGAGCCACTGACCATTTTGCAAGTGTCA 
GGACCAGCTACAGCGCGGTGGGCGCAAACATCCCGCTTTCCCTTTCCGGGATTTAGTCTGGGAGACGACGACGAGGG 

AAGAAG 

SEQ ID NO: 1553 

>T06014 # node_1 (TAA seg 3) #len 308

AAACCAGAGGAGTCCTGTCTGGGGGTCCCATCATTATTCGGGATACCCGCCGCCAGCGGCCTGCCTTCGGTTACCCA 
CATCCCCCTGGAACGGATATCTGTTTGGGGCACTACAATCTATCCTGTAGAACTATGGCCAAATCTCCATCAATGCT 
TTGCTAATTTCTGGGACTTAACTCGTAGAATCTACATACAGGGCTGGAATTTATTCAAAATGCATCTGAAGAAATGA 
CATTTTAAACCGTTTTAAAAAATATCTTGATAAAAAATTCTGTAAAACAGAATTTGATAGGTTTAAAAACATGACAG 



SEQ ID NO:1554 
>T06014 # node_33 (TAA segment 22) #len 140

GAGAATTTAGGGGAAGTCAGACTAATCCCTGGGTGCACTTCGAGAGAAAGGTTGGACCTCGCCTGAGGGATGCTGGA 
TCTGCAATGATAGAATTGGCCTGCCACAGACATTGGTATTTAAGTTGAAGAGCAGAATTCACA 

SEQ ID NO:1555 

>T06014 # node_34 (TAA segment 24) #len 200 

CTCTTATTTTCCAAAGCTCCAGGTGCTGTTTTCCTTGGGACTCATATACAGAGTCTTTGGTTCAAATGCTATGGTAC 
CCAGGCTGAAACCTTGAGAGAAGAACCACCGCCTTGGCCCAGTGAACCTGAAGGCCATACCCACAGACCTAAATACT 
GCACCTTCTGTGTAGCAGCTACTCAGGATGGACCAAACTATGTTGG 

SEQ ID NO: 1556 
>Forward primer:
GGTTCGGATGGACTACACTTTGTC 

SEQ ID NO: 1557 
>Reverse primer: 
CCACGTACTTCTGGGTGATGTC 

SEQ ID NO: 1558 
>Amplicon:

GGTTCGGATGGACTACACTTTGTCCGTACCCACCACGTAGCAGAGAAAACCACCTTGTATGACATGGACATTGACAT 
CACCCAGAAGTACGTGG 

SEQ ID NO:1559 

>AA281370 # transcript_0 #len 4783 (Includes node 31 - TAA seg 10) 
CGGCGTGACGATGGCGGCCGTAGGGTCCGGAGGCTATGCGCGGAACGATGCAGGGGAGAAGCTGCCCTCTGTCATGG 

CGGGAGTTCCGGCGCGGAGGGGCCAGTCCTCCCCGCCCCCCGCCCCACCAATCTGCCTACGGCGGCGGACGCGACTC 

TCGACGGCCTCCGAGGAGACGGTGCAGAACCGGGTGTCACTCGAGAAGGTGCTTGGCATCACAGCCCAGAACAGCAG 

TGGCCTAACCTGTGACCCCGGCACAGGCCATGTGGCCTACCTGGCAGGCTGTGTGGTGGTGATTTTGGACCCCAAGG 

AGAACAAGCAGCAGCACATCTTTAACACCGCCAGGAAGTCTCTCAGTGCTCTGGCCTTCTCCCCTGATGGGAAGTAC 

ATAGTGACAGGGGAGAATGGGCATAGGCCTGCTGTGCGCATCTGGGATGTGGAGGAGAAGAATCAGGTGGCGGAGAT 

GCTAGGCCACAAGTATGGTGTGGCGTGTGTGGCCTTCTCACCCAATATGAAGCACATCGTGTCCATGGGCTACCAAC 

ATGACATGGTGCTCAACGTCTGGGACTGGAAGAAAGACATCGTAGTGGCCTCCAACAAGGTATCTTGTAGAGTCATT 

GCCCTCTCCTTCTCAGAGGACAGCAGCTATTTTGTCACTGTTGGGAACCGCCATGTGAGGTTCTGGTTCTTGGAAGT 

CTCCACTGAGACAAAGGTGACGAGCACAGTGCCCCTTGTAGGGCGCTCGGGCATCCTGGGCGAGCTGCACAACAACA 

TCTTCTGTGGTGTGGCCTGCGGTCGGGGCCGGATGGCGGGCAGTACCTTCTGTGTGTCCTACTCGGGCCTCCTCTGC 

CAGTTCAATGAGAAGAGGGTGCTGGAGAAGTGGATCAACCTGAAGGTCTCCCTGTCTTCCTGCCTCTGTGTCAGCCA 

GGAGCTCATCTTCTGTGGCTGCACAGATGGGATAGTCCGCATCTTCCAGGCCCATAGCCTGCACTACCTCGCCAACC 
TGCCCAAGCCACACTACCTTGGGGTAGACGTGGCACAGGGCCTGGAGCCCAGGAAGGCGGAAGCAGTCTACCCAGAT

ACAGTGGCACTGACCTTCGACCCCATCCACCAGTGGCTGTCCTGCGTGTATAAGGACCACAGCATCTACATCTGGGA 

TGTCAAGGACATCAACAGAGTGGGCAAGGTGTGGTCAGAGCTCTTCCACAGCTCCTACGTTTGGAACGTGGAGGTGT 

ATCCTGAGTTTGAAGACCAGAGAGCTTGTTTGCCATCAGGATCCTTTCTGACTTGTTCTTCAGACAACACCATTCGC 

TTCTGGAACTTGGACAGCAGCCCTGATTCTCACTGGCAGAAAAACATCTTCAGCAATTTGGCTCCTGCTCTGCAGAT 

GTGTGGACGTGGCCACTCCCGGCAGCCCAACACACCCAGCCCAGGTGAAATTGCTTCCTAAGTCTTGAGATGGAATT 

CCCTGGGAGAGCACCTGATGAGCCAGCCCATTCAGCTGCCACACAACCCTGCTGAAGGTCGTGTACGTGGAGAATGA 

CATCCAGCACCTGCAGGACATGTCACACTTCCCAGACCGGGGGAGCGAGAATGGGACACCCATGGACGTGAAAGCCG 

GGGTGCGGGTCATGCAGGTCAGTCCTGACGGCCAGCATTTGGCTTCAGGCGACCGAAGTGGAAATCTGAGGATCCAC 

GAGCTGCACTTCATGGACGAGCTGGTCAAGGTGGAGGCCCATGATGCTGAGGTGCTGTGCCTGGAGTACTCCAAGCC 

AGAGACGGGGCTGACCTTGCTGGCCTCAGCCAGTCGGGACCGGCTGATCCATGTGCTGAACGTGGAGAAGAACTACA 

ACCTGGAGCAGACGCTGGATGACCACTCCTCCTCCATCACCGCCATCAAGTTCGCTGGCAACAGAGACATCCAGATG 

ATCAGCTGTGGGGCTGACAAGAGCATCTACTTTCGCAGTGCCCAGCAGGGTTCGGATGGACTACACTTTGTCCGTAC 

CCACCACGTAGCAGAGAAAACCACCTTGTATGACATGGACATTGACATCACCCAGAAGTACGTGGCCGTGGCCTGCC 

AGGACCGCAATGTGAGAGTCTACAACACTGTGAACGGGAAGCAGAAGAAGTGCTACAAGGGCTCCCAGGGTGACGAA 

GGGTCCTTGCTGAAGGTCCATGTGGACCCCTCAGGCACCTTCCTGGCCACCAGCTGCTCTGACAAAAGCATCTCAGT 

GATTGACTTTTACTCGGGCGAGTGCATTGCCAAGATGTTTGGCCATTCAGAAATTATTACCAGCATGAAGTTCACCT 

ATGACTGTCATCACTTGATCACAGTATCTGGAGACAGCTGCGTGTTCATCTGGCACCTGGGCCCGGAGATCACCAAC 

TGCATGAAGCAGCACTTGCTGGAGATTGACCACCGGCAGCAGCAGCAGCACACAAATGACAAGAAGCGGAGTGGCCA 

CCCCAGGCAGGATACGTATGTGTCCACACCTAGTGAGATTCACTCCCTGAGCCCTGGAGAGCAAACAGAGGATGATC 

TGGAGGAAGAGTGTGAGCCAGAAGAGATGCTGAAGACACCATCCAAAGATAGCTTGGATCCAGATCCTCGTTGCCTG 

CTAACCAACGGCAAGCTGCCACTGTGGGCAAAGCGGCTGCTAGGGGACGATGATGTGGCAGATGGCTCGGCCTTCCA 

CGCCAAGCGCAGCTACCAGCCCCACGGCCGCTGGGCAGAGCGGGCCGGCCAAGAGCCCCTCAAGACCATCCTGGATG 

CCCAGGACCTGGATTGCTACTTTACCCCCATGAAGCCCGAGAGTCTGGAGAACTCCATTCTGGATTCACTGGAGCCA 

CAGAGCCTGGCCAGCCTGCTGAGTGAGTCAGAGAGTCCCCAGGAAGCTGGCCGCGGGCACCCCTCCTTCCTGCCCCA 

GCAGAAGGAATCATCTGAGGCCAGTGAGCTCATCCTCTACTCTCTGGAGGCAGAAGTGACAGTCACAGGGACAGACA 

GCCAGTATTGCAGGAAGGAGGTGGAGGCCGGGCCTGGAGACCAGCAGGGCGACTCCTACCTCAGGGTGTCCTCCGAC 

AGCCCAAAGGACCAGAGCCCGCCTGAGGACTCGGGGGAGTCAGAGGCCGACCTGGAGTGCAGCTTCGCAGCCATCCA 

CTCCCCAGCTCCGCCTCCTGACCCTGCCCCTCGGTTTGCCACGTCGCTGCCCCATTTCCCAGGATGCGCAGGTCCCA 

CAGAAGATGAGCTGTCCCTGCCCGAGGGACCCAGCGTCCCCAGCAGCTCCCTACCCCAGACTCCGGAGCAGGAGAAG 

TTCCTCCGCCACCACTTTGAGACACTGACTGAGTCCCCCTGCAGAGCTCTGGGAGACGTGGAGGCCTCTGAAGCTGA 

AGACCACTTCTTCAACCCACGCCTGAGTATCTCCACGCAGTTCCTCTCAAGCCTCCAGAAGGCATCCAGGTTCACCC 

ATACCTTCCCTCCCCGGGCAACCCAGTGCCTTGTGAAGTCTCCAGAGGTCAAGCTCATGGACCGAGGCGGAAGCCAG 

CCCAGAGCAGGTACTGGCTACGCCTCCCCAGACAGGACCCACGTCCTCGCTGCAGGGAAGGCTGAAGAGACCCTGGA 

GGCCTGGCGCCCACCACCTCCCTGCCTTACGAGCCTGGCGTCCTGTGTCCCTGCTTCCTCCGTGCTGCCCACAGACA 

GGAATCTCCCAACGCCCACATCTGCACCCACCCCAGGCCTGGCTCAGGGTGTCCATGCCCCCTCCACCTGTTCCTAC 

ATGGAGGCCACTGCCAGCTCCCGTGCCAGGATATCACGCAGCATCTCCCTCGGTGACAGTGAGGGCCCTATCGTGGC 

CACACTGGCCCAGCCCCTCCGTAGGCCATCGTCCGTTGGGGAGCTGGCCTCCTTGGGCCAGGAGCTTCAGGCCATCA 
CCACCGCGACAACACCCAGTTTGGACAGTGAGGGCCAAGAGCCTGCCCTGCGTTCCTGGGGCAACCACGAGGCCCGG
GCCAACCTGAGACTGACCCTGTCAAGTGCCTGTGATGGGCTCCTGCAGCCCCCCGTGGATACCCAGCCTGGCGTCAC 
CGTCCCTGCAGTGAGCTTCCCAGCCCCTAGCCCTGTGGAAGAGAGCGCCCTGAGGCTCCACGGCTCTGCCTTTCGCC 
CAAGTCTCCCAGCTCCTGAGTCCCCTGGCCTTCCTGCCCACCCCAGTAACCCCCAGCTTCCAGAGGCCCGGCCTGGC 
ATCCCTGGCGGCACTGCCTCCCTCCTGGAGCCCACCTCCGGTGCACTTGGTCTGTTCCAGGGCAGCCCTGCCCGCTG 
GAGTGAGCCCTGGGTGCCGGTTGAAGCCCTGCCCCCATCTCCCCTTGAGCTGAGCAGGGTGGGGAACATCTTGCACA 
GGCTGCAGACCACCTTCCAAGAAGCCCTCGACCTTTACCGTGTGTTGGTCTCCAGTGGCCAGGTGGACACCGGGCAG 
CAGCAGGCACGGACTGAGCTGGTCTCCACCTTCCTGTGGATCCACAGCCAGCTGGAGGCTGAATGCCTGGTGGGGAC 
TAGTGTGGCCCCAGCCCAGGCTCTGCCCAGCCCAGGACCCCCGTCCCCACCGACGCTGTACCCCCTGGCCAGCCCAG 
ACCTGCAGGCCCTGCTGGAACACTACTCGGAGCTGCTGGTGCAGGCCGTGCGGAGGAAGGCACGGGGGCACTGAGGG 
CGCAGCCCCTCCACCGCAGCCCTGCTGCTTCTGAGGACTTAGGTATTTTAAGCGAATAAACTGACAGCTTTGAGGAA 

TGGTTCCTG 

SEQ ID NO:1560 

>AA281370 # transcript_1 #len 4656 (Includes node 31 - TAA seg 10)

CGGCGTGACGATGGCGGCCGTAGGGTCCGGAGGCTATGCGCGGAACGATGCAGGGGAGAAGCTGCCCTCTGTCATGG 

CGGGAGTTCCGGCGCGGAGGGGCCAGTCCTCCCCGCCCCCCGCCCCACCAATCTGCCTACGGCGGCGGACGCGACTC 

TCGACGGCCTCCGAGGAGACGGTGCAGAACCGGGTGTCACTCGAGAAGGTGCTTGGCATCACAGCCCAGAACAGCAG 

TGGCCTAACCTGTGACCCCGGCACAGGCCATGTGGCCTACCTGGCAGGCTGTGTGGTGGTGATTTTGGACCCCAAGG 

AGAACAAGCAGCAGCACATCTTTAACACCGCCAGGAAGTCTCTCAGTGCTCTGGCCTTCTCCCCTGATGGGAAGTAC 

ATAGTGACAGGGGAGAATGGGCATAGGCCTGCTGTGCGCATCTGGGATGTGGAGGAGAAGAATCAGGTGGCGGAGAT 

GCTAGGCCACAAGTATGGTGTGGCGTGTGTGGCCTTCTCACCCAATATGAAGCACATCGTGTCCATGGGCTACCAAC 

ATGACATGGTGCTCAACGTCTGGGACTGGAAGAAAGACATCGTAGTGGCCTCCAACAAGGTATCTTGTAGAGTCATT 

GCCCTCTCCTTCTCAGAGGACAGCAGCTATTTTGTCACTGTTGGGAACCGCCATGTGAGGTTCTGGTTCTTGGAAGT 

CTCCACTGAGACAAAGGTGACGAGCACAGTGCCCCTTGTAGGGCGCTCGGGCATCCTGGGCGAGCTGCACAACAACA 

TCTTCTGTGGTGTGGCCTGCGGTCGGGGCCGGATGGCGGGCAGTACCTTCTGTGTGTCCTACTCGGGCCTCCTCTGC 

CAGTTCAATGAGAAGAGGGTGCTGGAGAAGTGGATCAACCTGAAGGTCTCCCTGTCTTCCTGCCTCTGTGTCAGCCA 

GGAGCTCATCTTCTGTGGCTGCACAGATGGGATAGTCCGCATCTTCCAGGCCCATAGCCTGCACTACCTCGCCAACC 

TGCCCAAGCCACACTACCTTGGGGTAGACGTGGCACAGGGCCTGGAGCCCAGCTTCCTCTTCCACAGGAAGGCGGAA 

GCAGTCTACCCAGATACAGTGGCACTGACCTTCGACCCCATCCACCAGTGGCTGTCCTGCGTGTATAAGGACCACAG 

CATCTACATCTGGGATGTCAAGGACATCAACAGAGTGGGCAAGGTGTGGTCAGAGCTCTTCCACAGCTCCTACGTTT 

GGAACGTGGAGGTGTATCCTGAGTTTGAAGACCAGAGAGCTTGTTTGCCATCAGGATCCTTTCTGACTTGTTCTTCA 

GACAACACCATTCGCTTCTGGAACTTGGACAGCAGCCCTGATTCTCACTGGCAGAAAAACATCTTCAGCAATACCCT 

GCTGAAGGTCGTGTACGTGGAGAATGACATCCAGCACCTGCAGGACATGTCACACTTCCCAGACCGGGGGAGCGAGA 

ATGGGACACCCATGGACGTGAAAGCCGGGGTGCGGGTCATGCAGGTCAGTCCTGACGGCCAGCATTTGGCTTCAGGC 

GACCGAAGTGGAAATCTGAGGATCCACGAGCTGCACTTCATGGACGAGCTGGTCAAGGTGGAGGCCCATGATGCTGA 

GGTGCTGTGCCTGGAGTACTCCAAGCCAGAGACGGGGCTGACCTTGCTGGCCTCAGCCAGTCGGGACCGGCTGATCC 

ATGTGCTGAACGTGGAGAAGAACTACAACCTGGAGCAGACGCTGGATGACCACTCCTCCTCCATCACCGCCATCAAG 



TTCGCTGGCAACAGAGACATCCAGATGATCAGCTGTGGGGCTGACAAGAGCATCTACTTTCGCAGTGCCCAGCAGGG 

TTCGGATGGACTACACTTTGTCCGTACCCACCACGTAGCAGAGAAAACCACCTTGTATGACATGGACATTGACATCA 

CCCAGAAGTACGTGGCCGTGGCCTGCCAGGACCGCAATGTGAGAGTCTACAACACTGTGAACGGGAAGCAGAAGAAG 

TGCTACAAGGGCTCCCAGGGTGACGAAGGGTCCTTGCTGAAGGTCCATGTGGACCCCTCAGGCACCTTCCTGGCCAC 

CAGCTGCTCTGACAAAAGCATCTCAGTGATTGACTTTTACTCGGGCGAGTGCATTGCCAAGATGTTTGGCCATTCAG 

AAATTATTACCAGCATGAAGTTCACCTATGACTGTCATCACTTGATCACAGTATCTGGAGACAGCTGCGTGTTCATC 

TGGCACCTGGGCCCGGAGATCACCAACTGCATGAAGCAGCACTTGCTGGAGATTGACCACCGGCAGCAGCAGCAGCA 

CACAAATGACAAGAAGCGGAGTGGCCACCCCAGGCAGGATACGTATGTGTCCACACCTAGTGAGATTCACTCCCTGA 

GCCCTGGAGAGCAAACAGAGGATGATCTGGAGGAAGAGTGTGAGCCAGAAGAGATGCTGAAGACACCATCCAAAGAT 

AGCTTGGATCCAGATCCTCGTTGCCTGCTAACCAACGGCAAGCTGCCACTGTGGGCAAAGCGGCTGCTAGGGGACGA 

TGATGTGGCAGATGGCTCGGCCTTCCACGCCAAGCGCAGCTACCAGCCCCACGGCCGCTGGGCAGAGCGGGCCGGCC 

AAGAGCCCCTCAAGACCATCCTGGATGCCCAGGACCTGGATTGCTACTTTACCCCCATGAAGCCCGAGAGTCTGGAG 

AACTCCATTCTGGATTCACTGGAGCCACAGAGCCTGGCCAGCCTGCTGAGTGAGTCAGAGAGTCCCCAGGAAGCTGG 

CCGCGGGCACCCCTCCTTCCTGCCCCAGCAGAAGGAATCATCTGAGGCCAGTGAGCTCATCCTCTACTCTCTGGAGG 

CAGAAGTGACAGTCACAGGGACAGACAGCCAGTATTGCAGGAAGGAGGTGGAGGCCGGGCCTGGAGACCAGCAGGGC 

GACTCCTACCTCAGGGTGTCCTCCGACAGCCCAAAGGACCAGAGCCCGCCTGAGGACTCGGGGGAGTCAGAGGCCGA 

CCTGGAGTGCAGCTTCGCAGCCATCCACTCCCCAGCTCCGCCTCCTGACCCTGCCCCTCGGTTTGCCACGTCGCTGC 

CCCATTTCCCAGGATGCGCAGGTCCCACAGAAGATGAGCTGTCCCTGCCCGAGGGACCCAGCGTCCCCAGCAGCTCC 

CTACCCCAGACTCCGGAGCAGGAGAAGTTCCTCCGCCACCACTTTGAGACACTGACTGAGTCCCCCTGCAGAGCTCT 

GGGAGACGTGGAGGCCTCTGAAGCTGAAGACCACTTCTTCAACCCACGCCTGAGTATCTCCACGCAGTTCCTCTCAA 

GCCTCCAGAAGGCATCCAGGTTCACCCATACCTTCCCTCCCCGGGCAACCCAGTGCCTTGTGAAGTCTCCAGAGGTC 

AAGCTCATGGACCGAGGCGGAAGCCAGCCCAGAGCAGGTACTGGCTACGCCTCCCCAGACAGGACCCACGTCCTCGC 

TGCAGGGAAGGCTGAAGAGACCCTGGAGGCCTGGCGCCCACCACCTCCCTGCCTTACGAGCCTGGCGTCCTGTGTCC 

CTGCTTCCTCCGTGCTGCCCACAGACAGGAATCTCCCAACGCCCACATCTGCACCCACCCCAGGCCTGGCTCAGGGT 

GTCCATGCCCCCTCCACCTGTTCCTACATGGAGGCCACTGCCAGCTCCCGTGCCAGGATATCACGCAGCATCTCCCT 

CGGTGACAGTGAGGGCCCTATCGTGGCCACACTGGCCCAGCCCCTCCGTAGGCCATCGTCCGTTGGGGAGCTGGCCT 

CCTTGGGCCAGGAGCTTCAGGCCATCACCACCGCGACAACACCCAGTTTGGACAGTGAGGGCCAAGAGCCTGCCCTG 

CGTTCCTGGGGCAACCACGAGGCCCGGGCCAACCTGAGACTGACCCTGTCAAGTGCCTGTGATGGGCTCCTGCAGCC 

CCCCGTGGATACCCAGCCTGGCGTCACCGTCCCTGCAGTGAGCTTCCCAGCCCCTAGCCCTGTGGAAGAGAGCGCCC 

TGAGGCTCCACGGCTCTGCCTTTCGCCCAAGTCTCCCAGCTCCTGAGTCCCCTGGCCTTCCTGCCCACCCCAGTAAC 

CCCCAGCTTCCAGAGGCCCGGCCTGGCATCCCTGGCGGCACTGCCTCCCTCCTGGAGCCCACCTCCGGTGCACTTGG 

TCTGTTCCAGGGCAGCCCTGCCCGCTGGAGTGAGCCCTGGGTGCCGGTTGAAGCCCTGCCCCCATCTCCCCTTGAGC 

TGAGCAGGGTGGGGAACATCTTGCACAGGCTGCAGACCACCTTCCAAGAAGCCCTCGACCTTTACCGTGTGTTGGTC 

TCCAGTGGCCAGGTGGACACCGGGCAGCAGCAGGCACGGACTGAGCTGGTCTCCACCTTCCTGTGGATCCACAGCCA 

GCTGGAGGCTGAATGCCTGGTGGGGACTAGTGTGGCCCCAGCCCAGGCTCTGCCCAGCCCAGGACCCCCGTCCCCAC 

CGACGCTGTACCCCCTGGCCAGCCCAGACCTGCAGGCCCTGCTGGAACACTACTCGGAGCTGCTGGTGCAGGCCGTG 

CGGAGGAAGGCACGGGGGCACTGAGGGCGCAGCCCCTCCACCGCAGCCCTGCTGCTTCTGAGGACTTAGGTATTTTA 

AGCGAATAAACTGACAGCTTTGAGGAATGGTTCCTG 
SEQ ID NO:1561

>AA281370 # transcript_4 #len 4798 (Includes node 31 - TAA seg 10; node 56)

CGGCGTGACGATGGCGGCCGTAGGGTCCGGAGGCTATGCGCGGAACGATGCAGGGGAGAAGCTGCCCTCTGTCATGG 

CGGGAGTTCCGGCGCGGAGGGGCCAGTCCTCCCCGCCCCCCGCCCCACCAATCTGCCTACGGCGGCGGACGCGACTC 

TCGACGGCCTCCGAGGAGACGGTGCAGAACCGGGTGTCACTCGAGAAGGTGCTTGGCATCACAGCCCAGAACAGCAG 

TGGCCTAACCTGTGACCCCGGCACAGGCCATGTGGCCTACCTGGCAGGCTGTGTGGTGGTGATTTTGGACCCCAAGG 

AGAACAAGCAGCAGCACATCTTTAACACCGCCAGGAAGTCTCTCAGTGCTCTGGCCTTCTCCCCTGATGGGAAGTAC 

ATAGTGACAGGGGAGAATGGGCATAGGCCTGCTGTGCGCATCTGGGATGTGGAGGAGAAGAATCAGGTGGCGGAGAT 

GCTAGGCCACAAGTATGGTGTGGCGTGTGTGGCCTTCTCACCCAATATGAAGCACATCGTGTCCATGGGCTACCAAC 

ATGACATGGTGCTCAACGTCTGGGACTGGAAGAAAGACATCGTAGTGGCCTCCAACAAGGTATCTTGTAGAGTCATT 

GCCCTCTCCTTCTCAGAGGACAGCAGCTATTTTGTCACTGTTGGGAACCGCCATGTGAGGTTCTGGTTCTTGGAAGT 

CTCCACTGAGACAAAGGTGACGAGCACAGTGCCCCTTGTAGGGCGCTCGGGCATCCTGGGCGAGCTGCACAACAACA 

TCTTCTGTGGTGTGGCCTGCGGTCGGGGCCGGATGGCGGGCAGTACCTTCTGTGTGTCCTACTCGGGCCTCCTCTGC 

CAGTTCAATGAGAAGAGGGTGCTGGAGAAGTGGATCAACCTGAAGGTCTCCCTGTCTTCCTGCCTCTGTGTCAGCCA 

GGAGCTCATCTTCTGTGGCTGCACAGATGGGATAGTCCGCATCTTCCAGGCCCATAGCCTGCACTACCTCGCCAACC 

TGCCCAAGCCACACTACCTTGGGGTAGACGTGGCACAGGGCCTGGAGCCCAGGAAGGCGGAAGCAGTCTACCCAGAT 

ACAGTGGCACTGACCTTCGACCCCATCCACCAGTGGCTGTCCTGCGTGTATAAGGACCACAGCATCTACATCTGGGA 

TGTCAAGGACATCAACAGAGTGGGCAAGGTGTGGTCAGAGCTCTTCCACAGCTCCTACGTTTGGAACGTGGAGGTGT 

ATCCTGAGTTTGAAGACCAGAGAGCTTGTTTGCCATCAGGATCCTTTCTGACTTGTTCTTCAGACAACACCATTCGC 

TTCTGGAACTTGGACAGCAGCCCTGATTCTCACTGGCAGAAAAACATCTTCAGCAATTTGGCTCCTGCTCTGCAGAT 

GTGTGGACGTGGCCACTCCCGGCAGCCCAACACACCCAGCCCAGGTGAAATTGCTTCCTAAGTCTTGAGATGGAATT 

CCCTGGGAGAGCACCTGATGAGCCAGCCCATTCAGCTGCCACACAACCCTGCTGAAGGTCGTGTACGTGGAGAATGA 

CATCCAGCACCTGCAGGACATGTCACACTTCCCAGACCGGGGGAGCGAGAATGGGACACCCATGGACGTGAAAGCCG 

GGGTGCGGGTCATGCAGGTCAGTCCTGACGGCCAGCATTTGGCTTCAGGCGACCGAAGTGGAAATCTGAGGATCCAC 

GAGCTGCACTTCATGGACGAGCTGGTCAAGGTGGAGGCCCATGATGCTGAGGTGCTGTGCCTGGAGTACTCCAAGCC 

AGAGACGGGGCTGACCTTGCTGGCCTCAGCCAGTCGGGACCGGCTGATCCATGTGCTGAACGTGGAGAAGAACTACA 

ACCTGGAGCAGACGCTGGATGACCACTCCTCCTCCATCACCGCCATCAAGTTCGCTGGCAACAGAGACATCCAGATG 

ATCAGCTGTGGGGCTGACAAGAGCATCTACTTTCGCAGTGCCCAGCAGGGTTCGGATGGACTACACTTTGTCCGTAC 

CCACCACGTAGCAGAGAAAACCACCTTGTATGACATGGACATTGACATCACCCAGAAGTACGTGGCCGTGGCCTGCC 

AGGACCGCAATGTGAGAGTCTACAACACTGTGAACGGGAAGCAGAAGAAGTGCTACAAGGGCTCCCAGGGTGACGAA 

GGGTCCTTGCTGAAGGTCCATGTGGACCCCTCAGGCACCTTCCTGGCCACCAGCTGCTCTGACAAAAGCATCTCAGT 

GATTGACTTTTACTCGGGCGAGTGCATTGCCAAGATGTTTGGCCATTCAGAAATTATTACCAGCATGAAGTTCACCT 

ATGACTGTCATCACTTGATCACAGTATCTGGAGACAGCTGCGTGTTCATCTGGCACCTGGGCCCGGAGATCACCAAC 

TGCATGAAGCAGCACTTGCTGGAGATTGACCACCGGCAGCAGCAGCAGCACACAAATGACAAGAAGCGGAGTGGCCA 

CCCCAGGCAGGATACGTATGTGTCCACACCTAGTGAGATTCACTCCCTGAGCCCTGGAGAGCAAACAGAGGATGATC

TGGAGGAAGAGTGTGAGCCAGAAGAGATGCTGAAGACACCATCCAAAGATAGCTTGGATCCAGATCCTCGTTGCCTG 

CTAACCAACGGCAAGCTGCCACTGTGGGCAAAGCGGCTGCTAGGGGACGATGATGTGGCAGATGGCTCGGCCTTCCA 
CGCCAAGCGCAGCTACCAGCCCCACGGCCGCTGGGCAGAGCGGGCCGGCCAAGAGCCCCTCAAGACCATCCTGGATG

CCCAGGACCTGGATTGCTACTTTACCCCCATGAAGCCCGAGAGTCTGGAGAACTCCATTCTGGATTCACTGGAGCCA 

CAGAGCCTGGCCAGCCTGCTGAGTGAGTCAGAGAGTCCCCAGGAAGCTGGCCGCGGGCACCCCTCCTTCCTGCCCCA 

GCAGAAGGAATCATCTGAGGCCAGTGAGCTCATCCTCTACTCTCTGGAGGCAGAAGTGACAGTCACAGGGACAGACA 

GCCAGTATTGCAGGAAGGAGGTGGAGGCCGGGCCTGGAGACCAGCAGGGCGACTCCTACCTCAGGGTGTCCTCCGAC 

AGCCCAAAGGACCAGAGCCCGCCTGAGGACTCGGGGGAGTCAGAGGCCGACCTGGAGTGCAGCTTCGCAGCCATCCA 

CTCCCCAGCTCCGCCTCCTGACCCTGCCCCTCGGTTTGCCACGTCGCTGCCCCATTTCCCAGGATGCGCAGGTCCCA 

CAGAAGATGAGCTGTCCCTGCCCGAGGGACCCAGCGTCCCCAGCAGCTCCCTACCCCAGACTCCGGAGCAGGAGAAG 

TTCCTCCGCCACCACTTTGAGACACTGACTGAGTCCCCCTGCAGAGAGCTCTTCCCCGCAGCTCTGGGAGACGTGGA 

GGCCTCTGAAGCTGAAGACCACTTCTTCAACCCACGCCTGAGTATCTCCACGCAGTTCCTCTCAAGCCTCCAGAAGG 

CATCCAGGTTCACCCATACCTTCCCTCCCCGGGCAACCCAGTGCCTTGTGAAGTCTCCAGAGGTCAAGCTCATGGAC 

CGAGGCGGAAGCCAGCCCAGAGCAGGTACTGGCTACGCCTCCCCAGACAGGACCCACGTCCTCGCTGCAGGGAAGGC 

TGAAGAGACCCTGGAGGCCTGGCGCCCACCACCTCCCTGCCTTACGAGCCTGGCGTCCTGTGTCCCTGCTTCCTCCG 

TGCTGCCCACAGACAGGAATCTCCCAACGCCCACATCTGCACCCACCCCAGGCCTGGCTCAGGGTGTCCATGCCCCC 

TCCACCTGTTCCTACATGGAGGCCACTGCCAGCTCCCGTGCCAGGATATCACGCAGCATCTCCCTCGGTGACAGTGA 

GGGCCCTATCGTGGCCACACTGGCCCAGCCCCTCCGTAGGCCATCGTCCGTTGGGGAGCTGGCCTCCTTGGGCCAGG 

AGCTTCAGGCCATCACCACCGCGACAACACCCAGTTTGGACAGTGAGGGCCAAGAGCCTGCCCTGCGTTCCTGGGGC 

AACCACGAGGCCCGGGCCAACCTGAGACTGACCCTGTCAAGTGCCTGTGATGGGCTCCTGCAGCCCCCCGTGGATAC 

CCAGCCTGGCGTCACCGTCCCTGCAGTGAGCTTCCCAGCCCCTAGCCCTGTGGAAGAGAGCGCCCTGAGGCTCCACG 

GCTCTGCCTTTCGCCCAAGTCTCCCAGCTCCTGAGTCCCCTGGCCTTCCTGCCCACCCCAGTAACCCCCAGCTTCCA 

GAGGCCCGGCCTGGCATCCCTGGCGGCACTGCCTCCCTCCTGGAGCCCACCTCCGGTGCACTTGGTCTGTTCCAGGG 

CAGCCCTGCCCGCTGGAGTGAGCCCTGGGTGCCGGTTGAAGCCCTGCCCCCATCTCCCCTTGAGCTGAGCAGGGTGG 

GGAACATCTTGCACAGGCTGCAGACCACCTTCCAAGAAGCCCTCGACCTTTACCGTGTGTTGGTCTCCAGTGGCCAG 

GTGGACACCGGGCAGCAGCAGGCACGGACTGAGCTGGTCTCCACCTTCCTGTGGATCCACAGCCAGCTGGAGGCTGA 

ATGCCTGGTGGGGACTAGTGTGGCCCCAGCCCAGGCTCTGCCCAGCCCAGGACCCCCGTCCCCACCGACGCTGTACC 

CCCTGGCCAGCCCAGACCTGCAGGCCCTGCTGGAACACTACTCGGAGCTGCTGGTGCAGGCCGTGCGGAGGAAGGCA 

CGGGGGCACTGAGGGCGCAGCCCCTCCACCGCAGCCCTGCTGCTTCTGAGGACTTAGGTATTTTAAGCGAATAAACT 

GACAGCTTTGAGGAATGGTTCCTG 

SEQ ID NO: 1562 

>AA281370 # transcript_5 #len 4671 (Includes node 31 - TAA seg 10; node 56) 

CGGCGTGACGATGGCGGCCGTAGGGTCCGGAGGCTATGCGCGGAACGATGCAGGGGAGAAGCTGCCCTCTGTCATGG 

CGGGAGTTCCGGCGCGGAGGGGCCAGTCCTCCCCGCCCCCCGCCCCACCAATCTGCCTACGGCGGCGGACGCGACTC 

TCGACGGCCTCCGAGGAGACGGTGCAGAACCGGGTGTCACTCGAGAAGGTGCTTGGCATCACAGCCCAGAACAGCAG 

TGGCCTAACCTGTGACCCCGGCACAGGCCATGTGGCCTACCTGGCAGGCTGTGTGGTGGTGATTTTGGACCCCAAGG 

AGAACAAGCAGCAGCACATCTTTAACACCGCCAGGAAGTCTCTCAGTGCTCTGGCCTTCTCCCCTGATGGGAAGTAC 

ATAGTGACAGGGGAGAATGGGCATAGGCCTGCTGTGCGCATCTGGGATGTGGAGGAGAAGAATCAGGTGGCGGAGAT 

GCTAGGCCACAAGTATGGTGTGGCGTGTGTGGCCTTCTCACCCAATATGAAGCACATCGTGTCCATGGGCTACCAAC 
ATGACATGGTGCTCAACGTCTGGGACTGGAAGAAAGACATCGTAGTGGCCTCCAACAAGGTATCTTGTAGAGTCATT

GCCCTCTCCTTCTCAGAGGACAGCAGCTATTTTGTCACTGTTGGGAACCGCCATGTGAGGTTCTGGTTCTTGGAAGT 

CTCCACTGAGACAAAGGTGACGAGCACAGTGCCCCTTGTAGGGCGCTCGGGCATCCTGGGCGAGCTGCACAACAACA 

TCTTCTGTGGTGTGGCCTGCGGTCGGGGCCGGATGGCGGGCAGTACCTTCTGTGTGTCCTACTCGGGCCTCCTCTGC

CAGTTCAATGAGAAGAGGGTGCTGGAGAAGTGGATCAACCTGAAGGTCTCCCTGTCTTCCTGCCTCTGTGTCAGCCA 

GGAGCTCATCTTCTGTGGCTGCACAGATGGGATAGTCCGCATCTTCCAGGCCCATAGCCTGCACTACCTCGCCAACC 

TGCCCAAGCCACACTACCTTGGGGTAGACGTGGCACAGGGCCTGGAGCCCAGCTTCCTCTTCCACAGGAAGGCGGAA 

GCAGTCTACCCAGATACAGTGGCACTGACCTTCGACCCCATCCACCAGTGGCTGTCCTGCGTGTATAAGGACCACAG 

CATCTACATCTGGGATGTCAAGGACATCAACAGAGTGGGCAAGGTGTGGTCAGAGCTCTTCCACAGCTCCTACGTTT 

GGAACGTGGAGGTGTATCCTGAGTTTGAAGACCAGAGAGCTTGTTTGCCATCAGGATCCTTTCTGACTTGTTCTTCA 

GACAACACCATTCGCTTCTGGAACTTGGACAGCAGCCCTGATTCTCACTGGCAGAAAAACATCTTCAGCAATACCCT 

GCTGAAGGTCGTGTACGTGGAGAATGACATCCAGCACCTGCAGGACATGTCACACTTCCCAGACCGGGGGAGCGAGA 

ATGGGACACCCATGGACGTGAAAGCCGGGGTGCGGGTCATGCAGGTCAGTCCTGACGGCCAGCATTTGGCTTCAGGC 

GACCGAAGTGGAAATCTGAGGATCCACGAGCTGCACTTCATGGACGAGCTGGTCAAGGTGGAGGCCCATGATGCTGA 

GGTGCTGTGCCTGGAGTACTCCAAGCCAGAGACGGGGCTGACCTTGCTGGCCTCAGCCAGTCGGGACCGGCTGATCC 

ATGTGCTGAACGTGGAGAAGAACTACAACCTGGAGCAGACGCTGGATGACCACTCCTCCTCCATCACCGCCATCAAG 

TTCGCTGGCAACAGAGACATCCAGATGATCAGCTGTGGGGCTGACAAGAGCATCTACTTTCGCAGTGCCCAGCAGGG 

TTCGGATGGACTACACTTTGTCCGTACCCACCACGTAGCAGAGAAAACCACCTTGTATGACATGGACATTGACATCA 

CCCAGAAGTACGTGGCCGTGGCCTGCCAGGACCGCAATGTGAGAGTCTACAACACTGTGAACGGGAAGCAGAAGAAG 

TGCTACAAGGGCTCCCAGGGTGACGAAGGGTCCTTGCTGAAGGTCCATGTGGACCCCTCAGGCACCTTCCTGGCCAC 

CAGCTGCTCTGACAAAAGCATCTCAGTGATTGACTTTTACTCGGGCGAGTGCATTGCCAAGATGTTTGGCCATTCAG 

AAATTATTACCAGCATGAAGTTCACCTATGACTGTCATCACTTGATCACAGTATCTGGAGACAGCTGCGTGTTCATC 

TGGCACCTGGGCCCGGAGATCACCAACTGCATGAAGCAGCACTTGCTGGAGATTGACCACCGGCAGCAGCAGCAGCA 

CACAAATGACAAGAAGCGGAGTGGCCACCCCAGGCAGGATACGTATGTGTCCACACCTAGTGAGATTCACTCCCTGA 

GCCCTGGAGAGCAAACAGAGGATGATCTGGAGGAAGAGTGTGAGCCAGAAGAGATGCTGAAGACACCATCCAAAGAT 

AGCTTGGATCCAGATCCTCGTTGCCTGCTAACCAACGGCAAGCTGCCACTGTGGGCAAAGCGGCTGCTAGGGGACGA 

TGATGTGGCAGATGGCTCGGCCTTCCACGCCAAGCGCAGCTACCAGCCCCACGGCCGCTGGGCAGAGCGGGCCGGCC 

AAGAGCCCCTCAAGACCATCCTGGATGCCCAGGACCTGGATTGCTACTTTACCCCCATGAAGCCCGAGAGTCTGGAG 

AACTCCATTCTGGATTCACTGGAGCCACAGAGCCTGGCCAGCCTGCTGAGTGAGTCAGAGAGTCCCCAGGAAGCTGG 

CCGCGGGCACCCCTCCTTCCTGCCCCAGCAGAAGGAATCATCTGAGGCCAGTGAGCTCATCCTCTACTCTCTGGAGG 

CAGAAGTGACAGTCACAGGGACAGACAGCCAGTATTGCAGGAAGGAGGTGGAGGCCGGGCCTGGAGACCAGCAGGGC 

GACTCCTACCTCAGGGTGTCCTCCGACAGCCCAAAGGACCAGAGCCCGCCTGAGGACTCGGGGGAGTCAGAGGCCGA 

CCTGGAGTGCAGCTTCGCAGCCATCCACTCCCCAGCTCCGCCTCCTGACCCTGCCCCTCGGTTTGCCACGTCGCTGC 

CCCATTTCCCAGGATGCGCAGGTCCCACAGAAGATGAGCTGTCCCTGCCCGAGGGACCCAGCGTCCCCAGCAGCTCC 

CTACCCCAGACTCCGGAGCAGGAGAAGTTCCTCCGCCACCACTTTGAGACACTGACTGAGTCCCCCTGCAGAGAGCT 

CTTCCCCGCAGCTCTGGGAGACGTGGAGGCCTCTGAAGCTGAAGACCACTTCTTCAACCCACGCCTGAGTATCTCCA 

CGCAGTTCCTCTCAAGCCTCCAGAAGGCATCCAGGTTCACCCATACCTTCCCTCCCCGGGCAACCCAGTGCCTTGTG 

AAGTCTCCAGAGGTCAAGCTCATGGACCGAGGCGGAAGCCAGCCCAGAGCAGGTACTGGCTACGCCTCCCCAGACAG 
GACCCACGTCCTCGCTGCAGGGAAGGCTGAAGAGACCCTGGAGGCCTGGCGCCCACCACCTCCCTGCCTTACGAGCC

TGGCGTCCTGTGTCCCTGCTTCCTCCGTGCTGCCCACAGACAGGAATCTCCCAACGCCCACATCTGCACCCACCCCA 

GGCCTGGCTCAGGGTGTCCATGCCCCCTCCACCTGTTCCTACATGGAGGCCACTGCCAGCTCCCGTGCCAGGATATC 

ACGCAGCATCTCCCTCGGTGACAGTGAGGGCCCTATCGTGGCCACACTGGCCCAGCCCCTCCGTAGGCCATCGTCCG 

TTGGGGAGCTGGCCTCCTTGGGCCAGGAGCTTCAGGCCATCACCACCGCGACAACACCCAGTTTGGACAGTGAGGGC 

CAAGAGCCTGCCCTGCGTTCCTGGGGCAACCACGAGGCCCGGGCCAACCTGAGACTGACCCTGTCAAGTGCCTGTGA 

TGGGCTCCTGCAGCCCCCCGTGGATACCCAGCCTGGCGTCACCGTCCCTGCAGTGAGCTTCCCAGCCCCTAGCCCTG 

TGGAAGAGAGCGCCCTGAGGCTCCACGGCTCTGCCTTTCGCCCAAGTCTCCCAGCTCCTGAGTCCCCTGGCCTTCCT 

GCCCACCCCAGTAACCCCCAGCTTCCAGAGGCCCGGCCTGGCATCCCTGGCGGCACTGCCTCCCTCCTGGAGCCCAC 

CTCCGGTGCACTTGGTCTGTTCCAGGGCAGCCCTGCCCGCTGGAGTGAGCCCTGGGTGCCGGTTGAAGCCCTGCCCC 

CATCTCCCCTTGAGCTGAGCAGGGTGGGGAACATCTTGCACAGGCTGCAGACCACCTTCCAAGAAGCCCTCGACCTT 

TACCGTGTGTTGGTCTCCAGTGGCCAGGTGGACACCGGGCAGCAGCAGGCACGGACTGAGCTGGTCTCCACCTTCCT 

GTGGATCCACAGCCAGCTGGAGGCTGAATGCCTGGTGGGGACTAGTGTGGCCCCAGCCCAGGCTCTGCCCAGCCCAG 

GACCCCCGTCCCCACCGACGCTGTACCCCCTGGCCAGCCCAGACCTGCAGGCCCTGCTGGAACACTACTCGGAGCTG 

CTGGTGCAGGCCGTGCGGAGGAAGGCACGGGGGCACTGAGGGCGCAGCCCCTCCACCGCAGCCCTGCTGCTTCTGAG 

GACTTAGGTATTTTAAGCGAATAAACTGACAGCTTTGAGGAATGGTTCCTG 

SEQ ID NO:1563 

> AA281370 p1 # trn_0 #len 1044

MSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAEVLCLEYSKPETGLTL 
LASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQGSDGLHFVRTHHVAEK
TTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLATSCSDKSISVIDFYSG 
ECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQHTNDKKRSGHPRQDTY 
VSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDDDVADGSAFHAKRSYQ 
PHGRWAERAGQEPLKTILDAQDLDCYFTPMKPESLENSILDSLEPQSLASLLSESESPQEAGRGHPSFLPQQKESSE 
ASELILYSLEAEVTVTGTDSQYCRKEVEAGPGDQQGDSYLRVSSDSPKDQSPPEDSGESEADLECSFAAIHSPAPPP 
DPAPRFATSLPHFPGCAGPTEDELSLPEGPSVPSSSLPQTPEQEKFLRHHFETLTESPCRALGDVEASEAEDHFFNP 
RLSISTQFLSSLQKASRFTHTFPPRATQCLVKSPEVKLMDRGGSQPRAGTGYASPDRTHVLAAGKAEETLEAWRPPP 
PCLTSLASCVPASSVLPTDRNLPTPTSAPTPGLAQGVHAPSTCSYMEATASSRARISRSISLGDSEGPIVATLAQPL 
RRPSSVGELASLGQELQAITTATTPSLDSEGQEPALRSWGNHEARANLRLTLSSACDGLLQPPVDTQPGVTVPAVSF 
PAPSPVEESALRLHGSAFRPSLPAPESPGLPAHPSNPQLPEARPGIPGGTASLLEPTSGALGLFQGSPARWSEPWVP 
VEALPPSPLELSRVGNILHRLQTTFQEALDLYRVLVSSGQVDTGQQQARTELVSTFLWIHSQLEAECLVGTSVAPAQ 

ALPSPGPPSPPTLYPLASPDLQALLEHYSELLVQAVRRKARGH 
SEQ ID NO:1564 

> AA281370 p2 # trn_1 #len 1521

GVTMAAVGSGGYARNDAGEKLPSVMAGVPARRGQSSPPPAPPICLRRRTRLSTASEETVQNRVSLEKVLGITAQNSS
GLTCDPGTGHVAYLAGCVVVILDPKENKQQHIFNTARKSLSALAFSPDGKYIVTGENGHRPAVRIWDVEEKNQVAEM
LGHKYGVACVAFSPNMKHIVSMGYQHDMVLNVWDWKKDIVVASNKVSCRVIALSFSEDSSYFVTVGNRHVRFWFLEV
STETKVTSTVPLVGRSGILGELHNNIFCGVACGRGRMAGSTFCVSYSGLLCQFNEKRVLEKWINLKVSLSSCLCVSQ 
ELIFCGCTDGIVRIFQAHSLHYLANLPKPHYLGVDVAQGLEPSFLFHRKAEAVYPDTVALTFDPIHQWLSCVYKDHS 
IYIWDVKDINRVGKVWSELFHSSYVWNVEVYPEFEDQRACLPSGSFLTCSSDNTIRFWNLDSSPDSHWQKNIFSNTL 
LKVVYVENDIQHLQDMSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAE 
VLCLEYSKPETGLTLLASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQG 
SDGLHFVRTHHVAEKTTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLAT
SCSDKSISVIDFYSGECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQH 
TNDKKRSGHPRQDTYVSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDD 
DVADGSAFHAKRSYQPHGRWAERAGQEPLKTILDAQDLDCYFTPMKPESLENSILDSLEPQSLASLLSESESPQEAG 
RGHPSFLPQQKESSEASELILYSLEAEVTVTGTDSQYCRKEVEAGPGDQQGDSYLRVSSDSPKDQSPPEDSGESEAD 
LECSFAAIHSPAPPPDPAPRFATSLPHFPGCAGPTEDELSLPEGPSVPSSSLPQTPEQEKFLRHHFETLTESPCRAL 
GDVEASEAEDHFFNPRLSISTQFLSSLQKASRFTHTFPPRATQCLVKSPEVKLMDRGGSQPRAGTGYASPDRTHVLA 
AGKAEETLEAWRPPPPCLTSLASCVPASSVLPTDRNLPTPTSAPTPGLAQGVHAPSTCSYMEATASSRARISRSISL 
GDSEGPIVATLAQPLRRPSSVGELASLGQELQAITTATTPSLDSEGQEPALRSWGNHEARANLRLTLSSACDGLLQP 
PVDTQPGVTVPAVSFPAPSPVEESALRLHGSAFRPSLPAPESPGLPAHPSNPQLPEARPGIPGGTASLLEPTSGALG 
LFQGSPARWSEPWVPVEALPPSPLELSRVGNILHRLQTTFQEALDLYRVLVSSGQVDTGQQQARTELVSTFLWIHSQ 
LEAECLVGTSVAPAQALPSPGPPSPPTLYPLASPDLQALLEHYSELLVQAVRRKARGH 

SEQ ID NO:1565 

> AA281370 p5 # trn_4 #len 1049 

MSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAEVLCLEYSKPETGLTL 
LASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQGSDGLHFVRTHHVAEK 
TTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLATSCSDKSISVIDFYSG 
ECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQHTNDKKRSGHPRQDTY 
VSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDDDVADGSAFHAKRSYQ 
PHGRWAERAGQEPLKTILDAQDLDCYFTPMKPESLENSILDSLEPQSLASLLSESESPQEAGRGHPSFLPQQKESSE 
ASELILYSLEAEVTVTGTDSQYCRKEVEAGPGDQQGDSYLRVSSDSPKDQSPPEDSGESEADLECSFAAIHSPAPPP 
DPAPRFATSLPHFPGCAGPTEDELSLPEGPSVPSSSLPQTPEQEKFLRHHFETLTESPCRELFPAALGDVEASEAED 
HFFNPRLSISTQFLSSLQKASRFTHTFPPRATQCLVKSPEVKLMDRGGSQPRAGTGYASPDRTHVLAAGKAEETLEA 
WRPPPPCLTSLASCVPASSVLPTDRNLPTPTSAPTPGLAQGVHAPSTCSYMEATASSRARISRSISLGDSEGPIVAT 
LAQPLRRPSSVGELASLGQELQAITTATTPSLDSEGQEPALRSWGNHEARANLRLTLSSACDGLLQPPVDTQPGVTV 
PAVSFPAPSPVEESALRLHGSAFRPSLPAPESPGLPAHPSNPQLPEARPGIPGGTASLLEPTSGALGLFQGSPARWS 
EPWVPVEALPPSPLELSRVGNILHRLQTTFQEALDLYRVLVSSGQVDTGQQQARTELVSTFLWIHSQLEAECLVGTS 
VAPAQALPSPGPPSPPTLYPLASPDLQALLEHYSELLVQAVRRKARGH

SEQ ID NO:1566 

> AA281370 p6 # trn_5 #len 1526 
GVTMAAVGSGGYARNDAGEKLPSVMAGVPARRGQSSPPPAPPICLRRRTRLSTASEETVQNRVSLEKVLGITAQNSS

GLTCDPGTGHVAYLAGCVVVILDPKENKQQHIFNTARKSLSALAFSPDGKYIVTGENGHRPAVRIWDVEEKNQVAEM

LGHKYGVACVAFSPNMKHIVSMGYQHDMVLNVWDWKKDIVVASNKVSCRVIALSFSEDSSYFVTVGNRHVRFWFLEV 

STETKVTSTVPLVGRSGILGELHNNIFCGVACGRGRMAGSTFCVSYSGLLCQFNEKRVLEKWINLKVSLSSCLCVSQ 

ELIFCGCTDGIVRIFQAHSLHYLANLPKPHYLGVDVAQGLEPSFLFHRKAEAVYPDTVALTFDPIHQWLSCVYKDHS 

IYIWDVKDINRVGKVWSELFHSSYVWNVEVYPEFEDQRACLPSGSFLTCSSDNTIRFWNLDSSPDSHWQKNIFSNTL 

LKVVYVENDIQHLQDMSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAE 

VLCLEYSKPETGLTLLASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQG 

SDGLHFVRTHHVAEKTTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLAT 

SCSDKSISVIDFYSGECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQH 

TNDKKRSGHPRQDTYVSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDD 

DVADGSAFHAKRSYQPHGRWAERAGQEPLKTILDAQDLDCYFTPMKPESLENSILDSLEPQSLASLLSESESPQEAG 

RGHPSFLPQQKESSEASELILYSLEAEVTVTGTDSQYCRKEVEAGPGDQQGDSYLRVSSDSPKDQSPPEDSGESEAD 

LECSFAAIHSPAPPPDPAPRFATSLPHFPGCAGPTEDELSLPEGPSVPSSSLPQTPEQEKFLRHHFETLTESPCREL 

FPAALGDVEASEAEDHFFNPRLSISTQFLSSLQKASRFTHTFPPRATQCLVKSPEVKLMDRGGSQPRAGTGYASPDR 

THVLAAGKAEETLEAWRPPPPCLTSLASCVPASSVLPTDRNLPTPTSAPTPGLAQGVHAPSTCSYMEATASSRARIS 

RSISLGDSEGPIVATLAQPLRRPSSVGELASLGQELQAITTATTPSLDSEGQEPALRSWGNHEARANLRLTLSSACD 

GLLQPPVDTQPGVTVPAVSFPAPSPVEESALRLHGSAFRPSLPAPESPGLPAHPSNPQLPEARPGIPGGTASLLEPT 

SGALGLFQGSPARWSEPWVPVEALPPSPLELSRVGNILHRLQTTFQEALDLYRVLVSSGQVDTGQQQARTELVSTFL 

WIHSQLEAECLVGTSVAPAQALPSPGPPSPPTLYPLASPDLQALLEHYSELLVQAVRRKARGH 

SEQ ID NO:1567 

>AA281370 # node_31 (TAA seg 10) #len 122 

GGTTCGGATGGACTACACTTTGTCCGTACCCACCACGTAGCAGAGAAAACCACCTTGTAT 
GACATGGACATTGACATCACCCAGAAGTACGTGGCCGTGGCCTGCCAGGACCGCAATGTG 
AG 

SEQ ID NO:1568 

>AA281370 # node_56 (no TAA seg) #len 15 
AGCTCTTCCCCGCAG 

SEQ ID NO:1569 

>AA281370 - Unique aa seq coded by node 56 
ELFPA 

SEQ ID NO:1570 
>AA281370 - The elongated unique aa seq not found in SwissProt
(Q96AD9) encoded by T1 and 5. Note: even after the elongation the protein is
probably partial.

GVTMAAVGSGGYARNDAGEKLPSVMAGVPARRGQSSPPPAPPICLRRRTRLSTASEETVQNRVSLEKVLGITAQNSS 
GLTCDPGTGHVAYLAGCVVVILDPKENKQQHIFNTARKSLSALAFSPDGKYIVTGENGHRPAVRIWDVEEKNQVAEM 
LGHKYGVACVAFSPNMKHIVSMGYQHDMVLNVWDWKKDIVVASNKVSCRVIALSFSEDSSYFVTVGNRHVRFWFLEV 
STETKVTSTVPLVGRSGILGELHNNIFCGVACGRGRMAGSTFCVSYSGLLCQFNEKRVLEKWINLKVSLSSCLCVSQ 
ELIFCGCTDGIVRIFQAHSLHYLANLPKPHYLGVDVAQGLEPSFLFHRKAEAVYPDTVALTFDPIHQWLSCVYKDHS 
IYIWDVKDINRVGKVWSELFHSSYVWNVEVYPEFEDQRACLPSGSFLTCSSDNTIRFWNLDSSPDSHWQKNIFSNTL 

LKVVYVENDIQHLQDMSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAE
VLCLEYSKPETGLTLLASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQG
SDGLHFVRTHHVAEKTTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLAT
SCSDKSISVIDFYSGECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQH
TNDKKRSGHPRQDTYVSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDD 

DVADGSAFHAKRSYQPHGRWAERAGQEPLKTILDAQDLDCYFTP 
SEQ ID NO:1571 

>AA281370 - The elongated unique aa seq not found in SwissProt 
(Q96AD9) encoded by T0 and 4.

MSHFPDRGSENGTPMDVKAGVRVMQVSPDGQHLASGDRSGNLRIHELHFMDELVKVEAHDAEVLCLEYSKPETGLTL 
LASASRDRLIHVLNVEKNYNLEQTLDDHSSSITAIKFAGNRDIQMISCGADKSIYFRSAQQGSDGLHFVRTHHVAEK 
TTLYDMDIDITQKYVAVACQDRNVRVYNTVNGKQKKCYKGSQGDEGSLLKVHVDPSGTFLATSCSDKSISVIDFYSG 
ECIAKMFGHSEIITSMKFTYDCHHLITVSGDSCVFIWHLGPEITNCMKQHLLEIDHRQQQQHTNDKKRSGHPRQDTY
VSTPSEIHSLSPGEQTEDDLEEECEPEEMLKTPSKDSLDPDPRCLLTNGKLPLWAKRLLGDDDVADGSAFHAKRSYQ 

PHGRWAERAGQEPLKTILDAQDLDCYFTP 

SEQ ID NO: 1572 

>Forward primer: 

ACTCACTCAGAGACTAACACAAAGGAAG 

SEQ ID NO:1573 

>Reverse primer: 

AGTATGGGAAGAATTTACTGGTCACA 

SEQ ID NO:1574 
>Amplicon:

ACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTACAATAAGTTCATCCTTCTT 
CAGTGTGACCAGTAAATTCTTCCCATACT 
SEQ ID NO:1575

>Z21368 # transcriptj #len 3483 (includes node 51 - TAA seg 19)

GGCCATTCCAATGGAACAGACAGGGTAAGGACCAATCTGGACTGTGTTATCTTTTCCAGGTGCAAGTATGTGCTATG 

GAACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTATGCAGTACACAGGACCAATGCTGCCCATC 

CACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCT 

GTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCATTTACACCGCCGACCATGGTTACCATATTG 

GGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCA 

AGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGG 

GCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTC 

GAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGGCAAATTTCTACGTAAGAAGGAAGAA 

TCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCAAAGAACTATGCCAGCAGGCCAGGTA 

CCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAATGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGT 

GTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGCACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGAC 

AAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAA 

CCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTG 

AAATATATGACATAAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGAT 

GAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAA 

CGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACAAGTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGA 

GAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATAAGGCATACATTGACAAAGAGATTGAAGCTCTGCAAGAT 

AAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATGTAGCTGCAGTAAACAAAG 

CTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCCATTCAAGGAGGCTGCTC 

AGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGAGGAAGGAGAAGAGACGG 

CAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAACCACTGGCAGACAGCCCC 

GTTCTGGAACCTGGGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATG 

AGACGCATAATTTTCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAG 

CTCACAAATACAGTGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTG 

TCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATCTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTAC 

ACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGA 

GGAGCTACACAGTGTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATT 

ACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACT 

GCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGA 

AGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGAC 

CAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGG 

AGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGA 

GCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTG 

GAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATG 

GCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGA 
GAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAAC
ACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGG 
GTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTA 
TCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTG 
TACTAAAACAGTATTATCTTTTGAATATCGTAGGGACATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATG 
GTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGTAAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTT 
TTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAAT 
GCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAA 
AGTCAAATAAACCCCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTT 
GCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATA 
AATAAGTAAACAAAATGA 

SEQ ID NO: 1576 

>Z21368 # transcript_5 #len 3344 (includes node 53 - TAA seg 27) 

GGCTGCATCTTGCCTGGTGATGTGGTCAGAATACAGGGGTGCAGGCATCTCTCCAGCCTGACCCTGGCAAGAGTCAG 
TTAATCTTGCTCAGTGCCATTGCTGTGATCACACACCCACCCTTGCCACACAACTACCCATGCCTAGGAGACCAGAT 
GAGAGGGTGAGAAGAGTTGAAGGCCAATGAGTCACTGCTGTAGAAAAAGCAGCCCTAAGTGCCACCTTCCCCTGGCA 
TTGGATCTCAGCCATCACCGTGTGCCCCTTTACAGAGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACGAT 
CCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAAGC 
CAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGGCAAATTTCTA 
CGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCAAAGAACTATG 
CCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAATGCATTGAGGATACATCTGGCAAGC 
TTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGCACGCGGAACCTCTACGCTCGCGGC 
TTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAACG 
GCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTCCG 
TCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACATT 
GCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATGCT 
GGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACAAGTGTTTTATTCTTCCCAATGACT 
CTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATAAGGCATACATTGACAAAGAGATT 
GAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATGTAG 
CTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCCAT 
TCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGAGG 
AAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAACCA 
CTGGCAGACAGCCCCGTTCTGGAACCTGGGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTGTT 
TGCGTACAGTTAATGAGACGCATAATTTTCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGAAT 
ACAGATCCTTATCAGCTCACAAATACAGTGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTAAT 
GGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATCTTGATGTTGGAAATAAAGATGGAG 
GAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATCAA 
CTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCTGG

TGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAACA 

AATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCCTC 

TGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCAAACCC 

TGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATTTG 

AATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGTGG 

CGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAGAACTTCCCC 

AGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATTTCAGTTCAT 

CAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAAGGTGGAGAA 

GAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGATGAAACTGT 

TACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGCCACCAACAT 

TCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACGGAGACTCAT 

CGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGATTTTTTGCT 

TGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGACATAAGTATATACATGTTATCCAATC 

AAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGTAAATCTTTCAACACACTTCCACTGC 

CTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTGCA 

CTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTTGG 

CTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGCTT 

TTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTTGG 

TGGTGAAAATAAATAAATAAGTAAACAAAATGA 

SEQ ID NO:1577 

>Z21368 # transcript_12 #len 1247 (includes node 52 - TAA seg 21; node 25 - TAA seg 32)

AGCAACACATTCAGGCAAAGGGATGCGAGAAGAACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGAT 
TATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTT 
TGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATC 
ATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGA 
TATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATGTTTCGAACAAACAAGAAGGCCAAAA 
TTTGGCGTGATACATTCCTAGTGGAAAGAGGGTAATTATTGGTTCCTGGGGTGCTTCTGGGAACCAGTCCTAGTGGG 
CAGCTTTCCCTGCTGAGTATTTTTTTTCTCCTTATTTTTGTTTACTAAGCATGCAGATTTCGTAAACCTAGTCACAA 
GATTGAATGGTTTGCTGCTTATTCTGTAGTGGTCAATAGAGTAATAATTGCTGGATCAGAATTGTAAAGAATAACCC 
TCAAGTTGGTTAATTGGTACAAAAACACAGTTAGATAGAAGTTATAGAATTTGATAGTATAGTTGGGACATTATCGT 
TAACAATAATTTATGTATATCTTAAAATAGCTAGAAGTGAAGAATTGCAAAGTTCCCAACACAAGGAAAAGATAAAT 
GAGATGATGAATATCCCAATTATCTTGATTTGATCATTACACATTGTAGACTGGTATCCATATATCACACGTACCCC 
CAAAATATGTATAATTGTGATATATCAATTTTTAAAATACCAAAAAAGCAAGAGAATGACGACTCCACATTCCCCAA 
AAAGAATAAATTCTCATAAGCTTGGACCAAAGCCTTTATCATGGGTGTAGATTTACTGTTGCATTTCTCAGTGCTGG 
TTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATCCTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACC 
CGGCACCCTTCTTCCTACTTCCTACTTCTTTTCCTTGTGTGCCTTTCCTAGTTTTAAATAGATAAATGTATGCCATT
GTAATTATTTCCATTGTCACTTCTGGGTTTCCCCTTTTGGTTCATTAATACCCATTGCCTTGTTTTTCTCTGTACAT 

AAATTAGGAGAGAGA 
SEQ ID NO:1578 

>Z21368 # transcript_13 #len 5710 (includes node 2 - TAA seg 5)

AGGTTACTTGACTGGGAGTTCTCAGACCTCCAGTTTCAGCCCTGCCCTCAGCCTCCAATCCGTAAGAGACACCCAGC 

CCCAGCAATTGGATTGGGCAGCCCGTCTTGACACACCACTGTGCTGAGTGCTTGAGGACGTGTTTCAACAGATGGTT 

GGGGTTAGTGTGTGTCATCACATTCGAGTGGGGATTAAGAGAAGGAAGGCTGCCTTGCTGGAGCTGTGTGGTCTTCT 

CCAAGTGAGAGTCGCAGGCAATAGAACTACTTTGCTTTTGGAGGAAAAGGAGGAATTCATTTTCAGCAGACACAAGA 

AAAGCAGTTTTTTTTTCAGGGATTCTTCACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAA 

TTTCTTACCTGGTCATTATTTAGTCTACAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCT 

TGAAGAGAGCATAATTGGAATGGAGAGGTGCTGACGGCCACCCACCATCATCTAAAGAAGATAAACTTGGCAAATGA 

CATGCAGGTTCTTCAAGGCAGAATAATTGCAGAAAATCTTCAAAGGACCCTATCTGCAGATGTTCTGAATACCTCTG 

AGAATAGAGATTGATTATTCAACCAGGATACCTAATTCAAGAACTCCAGAAATCAGGAGACGGAGACATTTTGTCAG 

TTTTGCAACATTGGACCAAATACAATGAAGTATTCTTGCTGTGCTCTGGTTTTGGCTGTCCTGGGCACAGAATTGCT 

GGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGTTCAGAGGACGGATACAGCAGGAACGAAAAAACATCCGACCCA 

ACATTATTCTTGTGCTTACCGATGATCAAGATGTGGAGCTGGGGTCCCTGCAAGTCATGAACAAAACGAGAAAGATT 

ATGGAACATGGGGGGGCCACCTTCATCAATGCCTTTGTGACTACACCCATGTGCTGCCCGTCACGGTCCTCCATGCT 

CACCGGGAAGTATGTGCACAATCACAATGTCTACACCAACAACGAGAACTGCTCTTCCCCCTCGTGGCAGGCCATGC 

ATGAGCCTCGGACTTTTGCTGTATATCTTAACAACACTGGCTACAGAACAGCCTTTTTTGGAAAATACCTCAATGAA 

TATAATGGCAGCTACATCCCCCCTGGGTGGCGAGAATGGCTTGGATTAATCAAGAATTCTCGCTTCTATAATTACAC 

TGTTTGTCGCAATGGCATCAAAGAAAAGCATGGATTTGATTATGCAAAGGACTACTTCACAGACTTAATCACTAACG 

AGAGCATTAATTACTTCAAAATGTCTAAGAGAATGTATCCCCATAGGCCCGTTATGATGGTGATCAGCCACGCTGCG 

CCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTCTAAACTGTACCCCAATGCTTCCCAACACATAACTCCTAGTTA 

TAACTATGCACCAAATATGGATAAACACTGGATTATGCAGTACACAGGACCAATGCTGCCCATCCACATGGAATTTA 

CAAACATTCTACAGCGCAAAAGGCTCCAGACTTTGATGTCAGTGGATGATTCTGTGGAGAGGCTGTATAACATGCTC 

GTGGAGACGGGGGAGCTGGAGAATACTTACATCATTTACACCGCCGACCATGGTTACCATATTGGGCAGTTTGGACT 

GGTCAAGGGGAAATCCATGCCATATGACTTTGATATTCGTGTGCCTTTTTTTATTCGTGGTCCAAGTGTAGAACCAG 

GATCAATAGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACGATCCTGGATATTGCTGGGCTCGACACACCT 

CCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAAGCCAGGTAACAGGTTTCGAACAAACAAGAA 

GGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGGCAAATTTCTACGTAAGAAGGAAGAATCCAGCAAGAATA 

TCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCAAAGAACTATGCCAGCAGGCCAGGTACCAGACAGCCTGT 

GAACAACCGGGGCAGAAGTGGCAATGCATTGAGGATACATCTGGCAAGCTTCGAATTCACAAGTGTAAAGGACCCAG 

TGACCTGCTCACAGTCCGGCAGAGCACGCGGAACCTCTACGCTCGCGGCTTCCATGACAAAGACAAAGAGTGCAGTT 

GTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAACGGCAATTCTTGAGAAACCAGGGGACTCCA 

AAGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTCCGTCGAATTTGAAGGTGAAATATATGACAT 

AAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACATTGCTAAGCGTCATGATGAAGGCCACAAGG 
GGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATGCTGGCAGATAGCAGCAACGCCGTGGGCCCA

CCTACCACTGTCCGAGTGACACACAAGTGTTTTATTCTTCCCAATGACTCTATCCATTGTGAGAGAGAACTGTACCA 

ATCGGCCAGAGCGTGGAAGGACCATAAGGCATACATTGACAAAGAGATTGAAGCTCTGCAAGATAAAATTAAGAATT 

TAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATGTAGCTGCAGTAAACAAAGCTATTACAATAAA 

GAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCCATTCAAGGAGGCTGCTCAGGAAGTAGATAG 

CAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGAGGAAGGAGAAGAGACGGCAGAGGAAGGGGG 

AAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAACCACTGGCAGACAGCCCCGTTCTGGAACCTG 

GGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTGTTTGCGTACAGTTAATGAGACGCATAATTT 

TCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGAATACAGATCCTTATCAGCTCACAAATACAG 

TGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTAATGGAGCTCAGAAGCTGTCAAGGATATAAG 

CAGTGCAACCCAAGACCTAAGAATCTTGATGTTGGAAATAAAGATGGAGGAAGCTATGACCTACACAGAGGACAGTT 

ATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGT 

GTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCTGGTGGACTGGACTAATTACTTGAAGGATTT 

AGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTGACG 

GGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCAT 

AAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCAAACCCTGCATTTGAACCGACCAACATTAAGTCC 

AGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATTTGAATTCTGAACACTGGAGAAAAACCGAAA 

AATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGTGGCGATGGCATGACAGAGCTAGAGCTCGGG 

CCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAGAACTTCCCCAGTATGGTGGTCCTGGAAAGGACATTTT 

TGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATTTCAGTTCATCAGATGTTCACCATGGCCACCGCAGAAC 

ACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAAGGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCAC 

CTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGATGAAACTGTTACCTTACCCTAAACACAGTATTTCTTT 

TTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGCCACCAACATTCCAAGCTACCCTGGGTACCTTTGTGCA 

GTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACGGAGACTCATCGTTATAATTTACTATCTGCCAAGAGTA 

GAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGATTTTTTGCTTGTTTGTTTGTTTTGTACTAAAACAGTA 

TTATCTTTTGAATATCGTAGGGACATAAGTATATACATGTTATCCAATCAAGATGGCTAGAATGGTGCCTTTCTGAG 

TGTCTAAAACTTGACACCCCTGGTAAATCTTTCAACACACTTCCACTGCCTGCGTAATGAAGTTTTGATTCATTTTT 

AACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTGCACTTTGAGATTAAAATGCCATGTCTATTT 

GATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTTGGCTGTCATTGTGACAAAGTCAAATAAACC 

CCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGCTTTTGCCAGAAAATGTTGCATGTGTTTTAC 

CTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTTGGTGGTGAAAATAAATAAATAAGTAAACAA 

AATGAAGATTGCCTGCTCTCTCTGTGCCTAGCCTCAAAGCGTTCATCATACATCATACCTTTAAGATTGCTATATTT 

TGGGTTATTTTCTTGACAGGAGAAAAAGATCTAAAGATCTTTTATTTTCATCTTTTTTGGTTTTCTTGGCATGACTA 

AGAAGCTTAAATGTTGATAAAATATGACTAGTTTTGAATTTACACCAAGAACTTCTCAATAAAAGAAAATCATGAAT
GCTCCACAATTTCAACATACCACAAGAGAAGTTAATTTCTTAACATTGTGTTCTATGATTATTTGTAAGACCTTCAC 
CAAGTTCTGATATCTTTTAAAGACATAGTTCAAAATTGCTTTTGAAAATCTGTATTCTTGAAAATATCCTTGTTGTG 
TATTAGGTTTTTAAATACCAGCTAAAGGATTACCTCACTGAGTCATCAGTACCCTCCTATTCAGCTCCCCAAGATGA 
TGTGTTTTTGCTTACCCTAAGAGAGGTTTTCTTCTTATTTTTAGATAATTCAAGTGCTTAGATAAATTATGTTTTCT 
TTAAGTGTTTATGGTAAACTCTTTTAAAGAAAATTTAATATGTTATAGCTGAATCTTTTTGGTAACTTTAAATCTTT
ATCATAGACTCTGTACATATGTTCAAATTAGCTGCTTGCCTGATGTGTGTATCATCGGTGGGATGACAGAACAAACA 
TATTTATGATCATGAATAATGTGCTTTGTAAAAAGATTTCAAGTTATTAGGAAGCATACTCTGTTTTTTAATCATGT 
ATAATATTCCATGATACTTTTATAGAACAATTCTGGCTTCAGGAAAGTCTAGAAGCAATATTTCTTCAAATAAAAGG 

TGTTTAAACTTT 
SEQ ID NO:1579 

>Z21368 # transcript_14 #len 5663 (includes node 2 - TAA seg 5)

CGCAGACCGTCGCTAATGAATCTTGGGGCCGGTGTCGGGCCGGGGCGGCTTGATCGGCAACTAGGAAACCCCAGGCG 

CAGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTCCCGGCTGCCGGCGCTCCTCGGAGGTCAGGGCA 

GATGAGGAACATGACTCTCCCCCTTCGGAGGAGGAAGGAAGTCCCGCTGCCACCTTATCTCTGCTCCTCTGCCTCCT 

CCCTGTTCCCAGAGCTTTTTCTCTAGAGAAGATTTTGAAGGCGGCTTTTGGATTCTTCACTTCTCTTGAACAAGGAA 

CTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTATTTAGTCTACAATAAGTTCATCCTTCTTC 

AGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGAATGGAGAGGTGCTGACGGCCACCCACCA 

TCATCTAAAGAAGATAAACTTGGCAAATGACATGCAGGTTCTTCAAGGCAGAATAATTGCAGAAAATCTTCAAAGGA 

CCCTATCTGCAGATGTTCTGAATACCTCTGAGAATAGAGATTGATTATTCAACCAGGATACCTAATTCAAGAACTCC 

AGAAATCAGGAGACGGAGACATTTTGTCAGTTTTGCAACATTGGACCAAATACAATGAAGTATTCTTGCTGTGCTCT 

GGTTTTGGCTGTCCTGGGCACAGAATTGCTGGGAAGCCTCTGTTCGACTGTCAGATCCCCGAGGTTCAGAGGACGGA 

TACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATCAAGATGTGGAGCTGGGGTCC 

CTGCAAGTCATGAACAAAACGAGAAAGATTATGGAACATGGGGGGGCCACCTTCATCAATGCCTTTGTGACTACACC 

CATGTGCTGCCCGTCACGGTCCTCCATGCTCACCGGGAAGTATGTGCACAATCACAATGTCTACACCAACAACGAGA 

ACTGCTCTTCCCCCTCGTGGCAGGCCATGCATGAGCCTCGGACTTTTGCTGTATATCTTAACAACACTGGCTACAGA 

ACAGCCTTTTTTGGAAAATACCTCAATGAATATAATGGCAGCTACATCCCCCCTGGGTGGCGAGAATGGCTTGGATT 

AATCAAGAATTCTCGCTTCTATAATTACACTGTTTGTCGCAATGGCATCAAAGAAAAGCATGGATTTGATTATGCAA 

AGGACTACTTCACAGACTTAATCACTAACGAGAGCATTAATTACTTCAAAATGTCTAAGAGAATGTATCCCCATAGG 

CCCGTTATGATGGTGATCAGCCACGCTGCGCCCCACGGCCCCGAGGACTCAGCCCCACAGTTTTCTAAACTGTACCC 

CAATGCTTCCCAACACATAACTCCTAGTTATAACTATGCACCAAATATGGATAAACACTGGATTATGCAGTACACAG 

GACCAATGCTGCCCATCCACATGGAATTTACAAACATTCTACAGCGCAAAAGGCTCCAGACTTTGATGTCAGTGGAT 

GATTCTGTGGAGAGGCTGTATAACATGCTCGTGGAGACGGGGGAGCTGGAGAATACTTACATCATTTACACCGCCGA 

CCATGGTTACCATATTGGGCAGTTTGGACTGGTCAAGGGGAAATCCATGCCATATGACTTTGATATTCGTGTGCCTT 

TTTTTATTCGTGGTCCAAGTGTAGAACCAGGATCAATAGTCCCACAGATCGTTCTCAACATTGACTTGGCCCCCACG 

ATCCTGGATATTGCTGGGCTCGACACACCTCCTGATGTGGACGGCAAGTCTGTCCTCAAACTTCTGGACCCAGAAAA 

GCCAGGTAACAGGTTTCGAACAAACAAGAAGGCCAAAATTTGGCGTGATACATTCCTAGTGGAAAGAGGCAAATTTC 

TACGTAAGAAGGAAGAATCCAGCAAGAATATCCAACAGTCAAATCACTTGCCCAAATATGAACGGGTCAAAGAACTA 

TGCCAGCAGGCCAGGTACCAGACAGCCTGTGAACAACCGGGGCAGAAGTGGCAATGCATTGAGGATACATCTGGCAA 

GCTTCGAATTCACAAGTGTAAAGGACCCAGTGACCTGCTCACAGTCCGGCAGAGCACGCGGAACCTCTACGCTCGCG 

GCTTCCATGACAAAGACAAAGAGTGCAGTTGTAGGGAGTCTGGTTACCGTGCCAGCAGAAGCCAAAGAAAGAGTCAA 

CGGCAATTCTTGAGAAACCAGGGGACTCCAAAGTACAAGCCCAGATTTGTCCATACTCGGCAGACACGTTCCTTGTC 
CGTCGAATTTGAAGGTGAAATATATGACATAAATCTGGAAGAAGAAGAAGAATTGCAAGTGTTGCAACCAAGAAACA

TTGCTAAGCGTCATGATGAAGGCCACAAGGGGCCAAGAGATCTCCAGGCTTCCAGTGGTGGCAACAGGGGCAGGATG 

CTGGCAGATAGCAGCAACGCCGTGGGCCCACCTACCACTGTCCGAGTGACACACAAGTGTTTTATTCTTCCCAATGA 

CTCTATCCATTGTGAGAGAGAACTGTACCAATCGGCCAGAGCGTGGAAGGACCATAAGGCATACATTGACAAAGAGA 

TTGAAGCTCTGCAAGATAAAATTAAGAATTTAAGAGAAGTGAGAGGACATCTGAAGAGAAGGAAGCCTGAGGAATGT 

AGCTGCAGTAAACAAAGCTATTACAATAAAGAGAAAGGTGTAAAAAAGCAAGAGAAATTAAAGAGCCATCTTCACCC 

ATTCAAGGAGGCTGCTCAGGAAGTAGATAGCAAACTGCAACTTTTCAAGGAGAACAACCGTAGGAGGAAGAAGGAGA 

GGAAGGAGAAGAGACGGCAGAGGAAGGGGGAAGAGTGCAGCCTGCCTGGCCTCACTTGCTTCACGCATGACAACAAC 

CACTGGCAGACAGCCCCGTTCTGGAACCTGGGATCTTTCTGTGCTTGCACGAGTTCTAACAATAACACCTACTGGTG 

TTTGCGTACAGTTAATGAGACGCATAATTTTCTTTTCTGTGAGTTTGCTACTGGCTTTTTGGAGTATTTTGATATGA

ATACAGATCCTTATCAGCTCACAAATACAGTGCACACGGTAGAACGAGGCATTTTGAATCAGCTACACGTACAACTA 

ATGGAGCTCAGAAGCTGTCAAGGATATAAGCAGTGCAACCCAAGACCTAAGAATCTTGATGTTGGAAATAAAGATGG 

AGGAAGCTATGACCTACACAGAGGACAGTTATGGGATGGATGGGAAGGTTAATCAGCCCCGTCTCACTGCAGACATC 

AACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATGAGTACAGACAAAACTACAGACTTAGTCT 

GGTGGACTGGACTAATTACTTGAAGGATTTAGATAGAGTATTTGCACTGCTGAAGAGTCACTATGAGCAAAATAAAA 

CAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCTGAGCACGCTGTGTCAATGGAGATGGCC 

TCTGCTGACTCAGATGAAGACCCAAGGCATAAGGTTGGGAAAACACCTCATTTGACCTTGCCAGCTGACCTTCAAAC 

CCTGCATTTGAACCGACCAACATTAAGTCCAGAGAGTAAACTTGAATGGAATAACGACATTCCAGAAGTTAATCATT 

TGAATTCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTGGAAACCGATTTCAGT 

GGCGATGGCATGACAGAGCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCATTCGCAGGCACCCGAAAGAACTTCC 

CCAGTATGGTGGTCCTGGAAAGGACATTTTTGAAGATCAACTATATCTTCCTGTGCATTCCGATGGAATTTCAGTTC 

ATCAGATGTTCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGACCAAGGTGGAG 

AAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAGGCAGCGCCTCCTCTTCACTCTCCTCTGATTAGATGAAACT 

GTTACCTTACCCTAAACACAGTATTTCTTTTTAACTTTTTTATTTGTAAACTAATAAAGGTAATCACAGCCACCAAC 

ATTCCAAGCTACCCTGGGTACCTTTGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACGGAGACTC 

ATCGTTATAATTTACTATCTGCCAAGAGTAGAAAGAAAGGCTGGGGATATTTGGGTTGGCTTGGTTTTGATTTTTTG 

CTTGTTTGTTTGTTTTGTACTAAAACAGTATTATCTTTTGAATATCGTAGGGACATAAGTATATACATGTTATCCAA 

TCAAGATGGCTAGAATGGTGCCTTTCTGAGTGTCTAAAACTTGACACCCCTGGTAAATCTTTCAACACACTTCCACT 

GCCTGCGTAATGAAGTTTTGATTCATTTTTAACCACTGGAATTTTTCAATGCCGTCATTTTCAGTTAGATGATTTTG 

CACTTTGAGATTAAAATGCCATGTCTATTTGATTAGTCTTATTTTTTTATTTTTACAGGCTTATCAGTCTCACTGTT 

GGCTGTCATTGTGACAAAGTCAAATAAACCCCCAAGGACGACACACAGTATGGATCACATATTGTTTGACATTAAGC 

TTTTGCCAGAAAATGTTGCATGTGTTTTACCTCGACTTGCTAAAATCGATTAGCAGAAAGGCATGGCTAATAATGTT 

GGTGGTGAAAATAAATAAATAAGTAAACAAAATGAAGATTGCCTGCTCTCTCTGTGCCTAGCCTCAAAGCGTTCATC 

ATACATCATACCTTTAAGATTGCTATATTTTGGGTTATTTTCTTGACAGGAGAAAAAGATCTAAAGATCTTTTATTT 

TCATCTTTTTTGGTTTTCTTGGCATGACTAAGAAGCTTAAATGTTGATAAAATATGACTAGTTTTGAATTTACACCA 

AGAACTTCTCAATAAAAGAAAATCATGAATGCTCCACAATTTCAACATACCACAAGAGAAGTTAATTTCTTAACATT 

GTGTTCTATGATTATTTGTAAGACCTTCACCAAGTTCTGATATCTTTTAAAGACATAGTTCAAAATTGCTTTTGAAA 

ATCTGTATTCTTGAAAATATCCTTGTTGTGTATTAGGTTTTTAAATACCAGCTAAAGGATTACCTCACTGAGTCATC 
AGTACCCTCCTATTCAGCTCCCCAAGATGATGTGTTTTTGCTTACCCTAAGAGAGGTTTTCTTCTTATTTTTAGATA
ATTCAAGTGCTTAGATAAATTATGTTTTCTTTAAGTGTTTATGGTAAACTCTTTTAAAGAAAATTTAATATGTTATA 
GCTGAATCTTTTTGGTAACTTTAAATCTTTATCATAGACTCTGTACATATGTTCAAATTAGCTGCTTGCCTGATGTG 
TGTATCATCGGTGGGATGACAGAACAAACATATTTATGATCATGAATAATGTGCTTTGTAAAAAGATTTCAAGTTAT 
TAGGAAGCATACTCTGTTTTTTAATCATGTATAATATTCCATGATACTTTTATAGAACAATTCTGGCTTCAGGAAAG 
TCTAGAAGCAATATTTCTTCAAATAAAAGGTGTTTAAACTTT 

SEQ ID NO:1580 

>Z21368_PEA_2_P2 # trn_4 #len 630 

MCYGTPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIYTADHG 
YHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPG 
NRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLR 
IHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVE 
FEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTHKCFILPNDSI 
HCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLHPFK 
EAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLR 
TVNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDVGNKDGGS 

YDLHRGQLWDGWEG 
SEQ ID NO:1581 

>Unique aa seq coded by Z21368_PEA_2_P2
MCYG 

SEQ ID NO:1582 

>Z21368_PEA_2_P3 # trn_5 #len 546

MSHCCRKSSPKCHLPLALDLSHHRVPLYRVPQIVLNIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRTNK 
KAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGP 
SDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHTRQTRSLSVEFEGEIYD 
INLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTHKCFILPNDSIHCERELY 
QSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLHPFKEAAQEVD 
SKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHN 
FLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQ 

LWDGWEG 

SEQ ID NO:1583 

>Unique aa seq coded by Z21368_PEA_2_P3
MSHCCRKSSPKCHLPLALDLSHHRVPLYR 
SEQ ID NO:1584

>Z21368_PEA_2_P9 # trn_12 #len 139 

SNTFRQRDARRTPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYI 
IYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSMFRTNKKAKIWRDTFLVERG 

SEQ ID NO:1585 

>Unique aa seq coded by Z21368_PEA_2_P9 
SNTFRQRDARR 

SEQ ID NO:1586 

>Z21368_PEA_2_P1 # trn_13; trn_14 #len 871 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRKIMEHGGATF 
INAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAMHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPP 
GWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSA 
PQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELEN 
TYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVLNIDLAPTILDIAGLDTPPDVDGKSV 
LKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLRKKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQ 
CIEDTSGKLRIHKCKGPSDLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVH 
TRQTRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLADSSNAVGPPTTVRVTH 
KCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNLREVRGHLKRRKPEECSCSKQSYYNKEKGVKKQE 
KLKSHLHPFKEAAQEVDSKLQLFKENNRRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTS 
SNNNTYWCLRTVNETHNFLFCEFATGFLEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQCNPRPKN 

LDVGNKDGGSYDLHRGQLWDGWEG 
SEQ ID NO:1587 

>Z21368 # node 2 - TAA seg 5 #len 162 

GGATTCTTCACTTCTCTTGAACAAGGAACTCACTCAGAGACTAACACAAAGGAAGTAATTTCTTACCTGGTCATTAT 
TTAGTCTACAATAAGTTCATCCTTCTTCAGTGTGACCAGTAAATTCTTCCCATACTCTTGAAGAGAGCATAATTGGA 

ATGGAGAG 

SEQ ID NO:1588 

>Z21368 # node 51 - TAA seg 19 #len 78 

GGCCATTCCAATGGAACAGACAGGGTAAGGACCAATCTGGACTGTGTTATCTTTTCCAGGTGCAAGTATGTGCTATG 
G 

SEQ ID NO:1589 

>Z21368 # node 52 - TAA seg 21 #len 32 
AGCAACACATTCAGGCAAAGGGATGCGAGAAG 
SEQ ID NO:1590

>Z21368 # node 53 - TAA seg 27 #len 266 

GGCTGCATCTTGCCTGGTGATGTGGTCAGAATACAGGGGTGCAGGCATCTCTCCAGCCTGACCCTGGCAAGAGTCAG 
TTAATCTTGCTCAGTGCCATTGCTGTGATCACACACCCACCCTTGCCACACAACTACCCATGCCTAGGAGACCAGAT 
GAGAGGGTGAGAAGAGTTGAAGGCCAATGAGTCACTGCTGTAGAAAAAGCAGCCCTAAGTGCCACCTTCCCCTGGCA 

TTGGATCTCAGCCATCACCGTGTGCCCCTTTACAG 
SEQ ID NO:1591 

>Z21368 # node 25 - TAA seg 32 #len 831 

GTAATTATTGGTTCCTGGGGTGCTTCTGGGAACCAGTCCTAGTGGGCAGCTTTCCCTGCTGAGTATTTTTTTTCTCC 
TTATTTTTGTTTACTAAGCATGCAGATTTCGTAAACCTAGTCACAAGATTGAATGGTTTGCTGCTTATTCTGTAGTG 
GTCAATAGAGTAATAATTGCTGGATCAGAATTGTAAAGAATAACCCTCAAGTTGGTTAATTGGTACAAAAACACAGT 
TAGATAGAAGTTATAGAATTTGATAGTATAGTTGGGACATTATCGTTAACAATAATTTATGTATATCTTAAAATAGC 
TAGAAGTGAAGAATTGCAAAGTTCCCAACACAAGGAAAAGATAAATGAGATGATGAATATCCCAATTATCTTGATTT 
GATCATTACACATTGTAGACTGGTATCCATATATCACACGTACCCCCAAAATATGTATAATTGTGATATATCAATTT 
TTAAAATACCAAAAAAGCAAGAGAATGACGACTCCACATTCCCCAAAAAGAATAAATTCTCATAAGCTTGGACCAAA 
GCCTTTATCATGGGTGTAGATTTACTGTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTC 
TACCATCCTCCCCACGTTCTTCTCTAAGCTGCCTCCAAGCCTCACCCGGCACCCTTCTTCCTACTTCCTACTTCTTT 
TCCTTGTGTGCCTTTCCTAGTTTTAAATAGATAAATGTATGCCATTGTAATTATTTCCATTGTCACTTCTGGGTTTC 
CCCTTTTGGTTCATTAATACCCATTGCCTTGTTTTTCTCTGTACATAAATTAGGAGAGAGA 

SEQ ID NO:1592
>Forward primer: 
GGCGGCGGCAGGAT 

SEQ ID NO:1593 
>Reverse primer: 
GTCGGGAGCGCAGGG 

SEQ ID NO:1594 
>Amplicon:

GGCGGCGGCAGGATCGGCCAGAGGAGGAGGGAAGCGCTTTTTTTGATCCTGATTCCAGTTTGCCTCTCTCTTTTTTT 
CCCCCAAATTATTCTTCGCCTGATTTTCCTCGCGGAGCCCTGCGCTCCCGAC 
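
The relationship between the primer pair and the amplicon listed above can be checked mechanically. The following is a minimal sketch, not part of the original listing: the function names `reverse_complement` and `primers_flank` are hypothetical helpers introduced only for illustration, and the check assumes the conventional arrangement in which the amplicon begins with the forward primer and ends with the reverse complement of the reverse primer.

```python
# Illustrative check only; helper names are assumptions, not from the listing.

# Watson-Crick complements for an unambiguous DNA alphabet.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def primers_flank(amplicon: str, forward: str, reverse: str) -> bool:
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


# Sequences copied from the forward primer, reverse primer and amplicon
# records immediately above.
FORWARD = "GGCGGCGGCAGGAT"
REVERSE = "GTCGGGAGCGCAGGG"
AMPLICON = (
    "GGCGGCGGCAGGATCGGCCAGAGGAGGAGGGAAGCGCTTTTTTTGATCCTGATTCCAGTTTGCCTCTCTCTTTTTTT"
    "CCCCCAAATTATTCTTCGCCTGATTTTCCTCGCGGAGCCCTGCGCTCCCGAC"
)

if __name__ == "__main__":
    print(primers_flank(AMPLICON, FORWARD, REVERSE))
```

The same check could be applied to any other primer and amplicon triplet in the listing by substituting the three sequences.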

SEQ ID NO:1595 

>HUMHMGBOX # transcript_0 #len 2533 (Includes node 0 - TAA seg 2)
CCCGTCACATGCGATGGTTGTCTATTAACTTGTTCAAAAAAGTATCAGGAGTTGTCAAGGCAGAGAAGAGAGTGTTT

GCAAAAGGGGGAAAGTAGTTTGCTGCCTCTTTAAGACTAGGACTGAGAGAAAGAAGAGGAGAGAGAAAGAAAGGGAG 

AGAAGTTTGAGCCCCAGGCTTAAGCCTTTCCAAAAAATAATAATAACAATCATCGGCGGCGGCAGGATCGGCCAGAG 

GAGGAGGGAAGCGCTTTTTTTGATCCTGATTCCAGTTTGCCTCTCTCTTTTTTTCCCCCAAATTATTCTTCGCCTGA 

TTTTCCTCGCGGAGCCCTGCGCTCCCGACACCCCCGCCCGCCTCCCCTCCTCCTCTCCCCCCGCCCGCGGGCCCCCC 

AAAGTCCCGGCCGGGCCGAGGGTCGGCGGCCGCCGGCGGGCCGGGCCCGCGCACAGCGCCCGCATGTACAACATGAT 

GGAGACGGAGCTGAAGCCGCCGGGCCCGCAGCAAACTTCGGGGGGCGGCGGCGGCAACTCCACCGCGGCGGCGGCCG 

GCGGCAACCAGAAAAACAGCCCGGACCGCGTCAAGCGGCCCATGAATGCCTTCATGGTGTGGTCCCGCGGGCAGCGG 

CGCAAGATGGCCCAGGAGAACCCCAAGATGCACAACTCGGAGATCAGCAAGCGCCTGGGCGCCGAGTGGAAACTTTT 

GTCGGAGACGGAGAAGCGGCCGTTCATCGACGAGGCTAAGCGGCTGCGAGCGCTGCACATGAAGGAGCACCCGGATT 

ATAAATACCGGCCCCGGCGGAAAACCAAGACGCTCATGAAGAAGGATAAGTACACGCTGCCCGGCGGGCTGCTGGCC 

CCCGGCGGCAATAGCATGGCGAGCGGGGTCGGGGTGGGCGCCGGCCTGGGCGCGGGCGTGAACCAGCGCATGGACAG 

TTACGCGCACATGAACGGCTGGAGCAACGGCAGCTACAGCATGATGCAGGACCAGCTGGGCTACCCGCAGCACCCGG 

GCCTCAATGCGCACGGCGCAGCGCAGATGCAGCCCATGCACCGCTACGACGTGAGCGCCCTGCAGTACAACTCCATG 

ACCAGCTCGCAGACCTACATGAACGGCTCGCCCACCTACAGCATGTCCTACTCGCAGCAGGGCACCCCTGGCATGGC 

TCTTGGCTCCATGGGTTCGGTGGTCAAGTCCGAGGCCAGCTCCAGCCCCCCTGTGGTTACCTCTTCCTCCCACTCCA 

GGGCGCCCTGCCAGGCCGGGGACCTCCGGGACATGATCAGCATGTATCTCCCCGGCGCCGAGGTGCCGGAACCCGCC 

GCCCCCAGCAGACTTCACATGTCCCAGCACTACCAGAGCGGCCCGGTGCCCGGCACGGCCATTAACGGCACACTGCC 

CCTCTCACACATGTGAGGGCCGGACAGCGAACTGGAGGGGGGAGAAATTTTCAAAGAAAAACGAGGGAAATGGGAGG 

GGTGCAAAAGAGGAGAGTAAGAAACAGCATGGAGAAAACCCGGTACGCTCAAAAAGAAAAAGGAAAAAAAAAAATCC

CATCACCCACAGCAAATGACAGCTGCAAAAGAGAACACCAATCCCATCCACACTCACGCAAAAACCGCGATGCCGAC 

AAGAAAACTTTTATGAGAGAGATCCTGGACTTCTTTTTGGGGGACTATTTTTGTACAGAGAAAACCTGGGGAGGGTG 

GGGAGGGCGGGGGAATGGACCTTGTATAGATCTGGAGGAAAGAAAGCTACGAAAAACTTTTTAAAAGTTCTAGTGGT 

ACGGTAGGAGCTTTGCAGGAAGTTTGCAAAAGTCTTTACCAATAATATTTAGAGCTAGTCTCCAAGCGACGAAAAAA 

ATGTTTTAATATTTGCAAGCAACTTTTGTACAGTATTTATCGAGATAAACATGGCAATCAAAATGTCCATTGTTTAT 

AAGCTGAGAATTTGCCAATATTTTTCAAGGAGAGGCTTCTTGCTGAATTTTGATTCTGCAGCTGAAATTTAGGACAG 

TTGCAAACGTGAAAAGAAGAAAATTATTCAAATTTGGACATTTTAATTGTTTAAAAATTGTACAAAAGGAAAAAATT 

AGAATAAGTACTGGCGAACCATCTCTGTGGTCTTGTTTAAAAAGGGCAAAAGTTTTAGACTGTACTAAATTTTATAA 

CTTACTGTTAAAAGCAAAAATGGCCATGCAGGTTGACACCGTTGGTAATTTATAATAGCTTTTGTTCGATCCCAACT 

TTCCATTTTGTTCAGATAAAAAAAACCATGAAATTACTGTGTTTGAAATATTTTCTTATGGTTTGTAATATTTCTGT 

AAATTTATTGTGATATTTTAAGGTTTTCCCCCCTTTATTTTCCGTAGTTGTATTTTAAAAGATTCGGCTCTGTATTA 

TTTGAATCAGTCTGCCGAGAATCCATGTATATATTTGAACTAATATCATCCTTATAACAGGTACATTTTCAACTTAA 

GTTTTTACTCCATTATGCACAGTTTGAGATAAATAAATTTTTGAAATATGGACACTGAAAAAATCTTGA 

SEQ ID NO:1596 

>HUMHMGBOX_P1 # trn_0 #len 317

MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMVWSRGQRRKMAQENPKMHNSEISKRLGA 
EWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLMKKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVN 
QRMDSYAHMNGWSNGSYSMMQDQLGYPQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQG
TPGMALGSMGSVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMISMYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 

SEQ ID NO:1597

>HUMHMGBOX # node_0 (TAA segment 2) #len 161 

TCGGCGGCGGCAGGATCGGCCAGAGGAGGAGGGAAGCGCTTTTTTTGATCCTGATTCCAGTTTGCCTCTCTCTTTTT 
TTCCCCCAAATTATTCTTCGCCTGATTTTCCTCGCGGAGCCCTGCGCTCCCGACACCCCCGCCCGCCTCCCCTCCTC 

CTCTCCC 

SEQ ID NO:1598
>Forward primer: 
CCCCAGACTCTGTGCACTTCA 

SEQ ID NO:1599 
>Reverse primer: 
TGGGCTCTGCTCTGTCTTAGTGTA 

SEQ ID NO:1600
>Amplicon:

CCCCAGACTCTGTGCACTTCAGACCAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTATGAGAAAACCTGTGTGGACAT 
CCCTTGGTGTACACTAAGACAGAGCAGAGCCCA 

SEQ ID NO:1601 

>HSB6PR # transcriptj #len 5390 (Includes node 29 - TAA seg 34) 

GGGGTGGTGCAGGGCAGGGGTGGTATATCCTGTCTGACGGAGGGCGGGCCTCGCCAGTGCCAGAGAGGGACGAACCA 
GGGTGGAAGCGCCAGGAGCAGCTGCAGGGAGCCCTCACGCGGACCTCGCACTCTATGGCCGTAGGGAGCCGCTGAGA 
GCGAGAAGAGCACGCTCCTGCCCGCCCGCTGCACCGCACCTCGCCTCGCCTCTCTGCTCTCCTAGGCCCCGGCCGCG 
CGCCACCCGCCTCCCGCCACCATGAACCACTCGCCGCTCAAGACCGCCTTGGCGTACGAATGCTTCCAGGACCAGGA 
CAACTCCACGTTGGCTTTGCCGTCGGACCAAAAGATGAAAACAGGCACGTCTGGCAGGCAGCGCGTGCAGGAGCAGG 
TGATGATGACCGTCAAGCGGCAGAAGTCCAAGTCTTCCCAGTCGTCCACCCTGAGCCACTCCAATCGAGGTTCCATG 
TATGATGGCTTGGCTGACAATTACAACTATGGGACCACCAGCAGGAGCAGCTACTACTCCAAGTTCCAGGCAGGGAA 
TGGCTCATGGGGATATCCGATCTACAATGGAACCCTCAAGCGGGAGCCTGACAACAGGCGCTTCAGCTCCTACAGCC 
AGATGGAGAACTGGAGCCGGCACTACCCCCGGGGCAGCTGTAACACCACCGGCGCAGGCAGCGACATCTGCTTCATG 
CAGAAAATCAAGGCGAGCCGCAGTGAGCCCGACCTCTACTGTGACCCACGGGGCACCCTGCGCAAGGGCACGCTGGG 
CAGCAAGGGCCAGAAGACCACCCAGAACCGCTACAGCTTTTACAGCACCTGCAGTGGTCAGAAGGCCATAAAGAAGT 
GCCCTGTGCGCCCGCCCTCTTGTGCCTCCAAGCAGGACCCTGTGTATATCCCGCCCATCTCCTGCAACAAGGACCTG 
TCCTTTGGCCACTCTAGGGCCAGCTCCAAGATCTGCAGTGAGGACATCGAGTGCAGTGGGCTGACCATCCCCAAGGC 
TGTGCAGTACCTGAGCTCCCAGGATGAGAAGTACCAGGCCATTGGGGCCTATTACATCCAGCATACCTGCTTCCAGG

ATGAATCTGCCAAGCAACAGGTCTATCAGCTGGGAGGCATCTGCAAGCTGGTGGACCTCCTCCGCAGCCCCAACCAG 

AACGTCCAGCAGGCCGCGGCAGGGGCCCTGCGCAACCTGGTGTTCAGGAGCACCACCAACAAGCTGGAGACCCGGAG 

GCAGAATGGGATCCGCGAGGCAGTCAGCCTCCTGAGGAGAACCGGGAACGCCGAGATCCAGAAGCAGCTGACTGGGC 

TGCTCTGGAACCTGTCTTCCACTGACGAGCTGAAGGAGGAACTCATTGCCGACGCCCTGCCTGTTCTGGCCGACCGC 

GTCATCATTCCCTTCTCTGGCTGGTGCGATGGCAATAGCAACATGTCCCGGGAAGTGGTGGACCCTGAGGTCTTCTT 

CAATGCCACAGGCTGCTTGAGGAACCTGAGCTCGGCCGATGCAGGCCGCCAGACCATGCGTAACTACTCAGGGCTCA 

TTGATTCCCTCATGGCCTATGTCCAGAACTGTGTAGCGGCCAGCCGCTGTGACGACAAGTCTGTGGAAAACTGCATG 

TGTGTTCTGCACAACCTCTCCTACCGCCTGGACGCCGAGGTGCCCACCCGCTACCGCCAGCTGGAGTATAACGCCCG 

CAACGCCTACACCGAGAAGTCCTCCACTGGCTGCTTCAGCAACAAGAGCGACAAGATGATGAACAACAACTATGACT 

GCCCCCTGCCTGAGGAAGAGACCAACCCCAAGGGCAGCGGCTGGTTGTACCATTCAGATGCCATCCGCACCTACCTG 

AACCTCATGGGCAAGAGCAAGAAAGATGCTACCCTGGAGGCCTGTGCTGGTGCCCTGCAGAACCTGACAGCCAGCAA 

GGGGCTGATGTCCAGTGGCATGAGCCAGTTGATTGGGCTGAAGGAAAAGGGCCTGCCACAAATTGCCCGCCTCCTGC 

AATCTGGCAACTCTGATGTGGTGCGGTCCGGAGCCTCCCTCCTGAGCAACATGTCCCGCCACCCTCTGCTGCACAGA 

GTGATGGGGAACCAGGTGTTCCCGGAGGTGACCAGGCTCCTCACCAGCCACACTGGCAATACCAGCAACTCCGAAGA 

CATCTTGTCCTCGGCCTGCTACACTGTGAGGAACCTGATGGCCTCGCAGCCACAACTGGCCAAGCAGTACTTCTCCA 

GCAGCATGCTCAACAACATCATCAACCTGTGCCGAAGCAGTGCCTCACCCAAGGCCGCAGAAGCTGCCCGGCTTCTC 

CTGTCTGACATGTGGTCCAGCAAGGAACTGCAGGGTGTCCTCAGACAGCAAGGTTTCGATAGGAACATGCTGGGAAC 

CTTAGCTGGGGCCAACAGCCTCAGGAACTTCACCTCCCGATTCTAAGAAGAGACTGTCCAAGCAAGTTAGGCTTGCA 

GGAAGATATGACCCAGCTGAGAAGCCCTCAGGCCTCGCTGGATGGGGTTTTCTGTCCATCCTGTGCAGTATTTGGGA 

AAGTTCACAAGAAACTGAGAAGAAACCTAAAAACTGTGGATAGTGGAAAGATTTTTAGATTTTTTTTTTCCTTGGGG

AAACTGGCAGGCAATGGGGGTTAGGGAGGTTGGGGCGGGGGGGGCTTTCTTGAGTTAAAGGGGCTTATATGTGATGT 
CAATATTTCTTCCTCTGAGAAATGGTATATATATGTGTCTAATGTAAGTGTGTGCATGCATGTGCGCGTGCATGTGT 
GTGTGTGTGAGTGTCTTAAAGCATAACCACAAACTGCAAAAAGCTAGGTAAGCTATTTTGTTGCAGCTCATAAGGTG 
GTGAAAAGGACTCTCCTGTGTTTCTTACTCATAGGCAAGGACAACATGTGCTTTTTGGTGAGCTGCTCATAATTCCT 
GAAATGTGTGGTGCCAGGGCAAGGGGGCCATCACTGCAGTCAGGCCCTCAGAGGAGTCCTGCAGGCTTCCTACCAGT 
GGTCTCCAAGGGTGCAGGAGTAACTGGGGCTGGGCCAGCCTCCCCCCTTACAAGGCTGCTTTCCAGGAAGGGAGGTC 
TGGTGTATCTCATGGGAGAATCTGGGGTGTCTGTAGTGTCACCCCTCCAGCAGCGCCACAAGGACTGAGGTTGGGTA 
GGTGTGAGGTTCCAGAGGACAGCAGGACACTCTCGCATACTTTGCCAAATGAGGCCTGCTCAGAGGAGTAGGAGCTG 
AAAGATGGTGCCTTCCACCCTCTTGGGCTGTGTGCCCATCAGAGCAGGCTCAGCCTGCAAAGGCCCTGCATTCAGAG 
GTCTTGTAATCTACTTGTTGCAGGAGAAAGAAGGTAAAAAATGATTTTTTTAAGAAAAGCTATTTTATTGCAGCTCT 
TTCCCAAGAGCTGTTCTGGGAATGGCTGGTCTTCATATTCCCAGTGGAGAGGGGAACAAGTGGGGCTGGGCATATAC 
CTATTCCGGCTTCTAGTGGGATGGAGTTGGGGTATAGAAATTAACCAGGAAGATGTTTCCACCAAGCCTGCTGTGAG 
TCAATTGAGGGAGTGTTTGGGGTCCCAGGAGACTTGGACGGGGGGAGTTTGGGTAGACTAGGAAAGGAAAGTGCCAT 
ATCAGGGTACCGGTACCGGCAAGCTCACATCTCAGCCAGGGGCCATGCCCCACTTCCCCTGACCCCAGCTGTCTTGT 
CTCCACTCTGTGAAACCCACAGGGGATGTGATAAACAGGGCTATTAGGGGTATCAGCCACGTCGAGCCCCCAGACTC 
TGTGCACTTCAGACCAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTATGAGAAAACCTGTGTGGACATCCCTTGGTGT 
ACACTAAGACAGAGCAGAGCCCAGCGCTCCCAAGCCTTCCTCCTTCCAGCTTCTACCTCCATGCTAGCATTGCTGGT 
GTTAGAGAGGAATTAACTTCCTGGTCTGTGCCCTTCTCTAGAAGAATATAAGATGCTCCTCCTCCTCACCCCTTCTC

AGCCTCCTCCCAAGTCTTCCTCTTCTGCACCACCCCCGAGTCCAAACCCACCTCTTGCCCCAGCATTCAGGCTGGAA 

AACACTGATGTGGACTCAGTATGACAACTGAGATGGGGGAAGCCAGACATGTGAGGACGCTGTCCTCCGAGAGGTGT 

CCCCGGCTGTTAGCCAGCTGTGCTGTGGTGCTGTGGGTCTGTCATACCCTCCCTTGCTTCTGTTCACACTGGGAGGC 

CCACTCCTGGCTCACCTCTCCCTCTCAGGGACCCACGTGGGAGCCTGGATCCCTGGACTGTCCTGGGCATAGGTTTC 

AGGGGCCTCCTTTGTTGTCATCAGAACCCAGAGGAATTCTTCTCCTAAAAAATACGTATGGCATACCAATCTGTGCG 

GGGCAGTGTCCTAAGCACTTAGACTACATCAGGGAAGAACACAGACCACATCCCTGTCCTCATGCGGCTTATGTTTT 

CTGGAGGAAAGTGGAGACACAAGTCCTTGGCTTTAGGGCTCCCCCGGCTGGGGGCTGTGCAGTCCGGTCAGGGCGGG 

AGGGGAAATGCACCGCTGCATGTGAACCTTACCAGCCCAGGCGGATGCCCCTTCCCCTTAGCACTACCCTGGCCTCC 

TGCATCCCCTCGCCTCATGTTCCTCCCACCTTCAAAGAATGAAGAGCCCCATGGGCCCAGCCCCTGCCCTGGGAACC 

AGGCAGCCTTCCAGACCTCAGGGGCTGAGGCAGACTATTAGGGCAGGGCTGACTTTGGTGACACTGCCCATTCCCTC 

TCAGGCCAGCTCAGGTCACCCGGGCCTCTGACCCAGGCCTGTCACTTTGAGAGGGGCAAAACTGAGAGGGGCTTTTC 

CTAGAGAAAGAGAACAAGGAGCTTGCCAGGCTTCATGTAGCCGACACACGTCTCAGGATTTTAAGTCCACATTGGCC 

TCACACTACCAGGGCCAATGCCCAAAATAAGGAGTTCCAATTTGGGGCCAAATGAGGAAGGACACAGACTCTGCCCT 

GGGATCTCCTGTGCTAGCGGCCAATGACAAATCCAGTCATTGGCCACCAGCCACCTCTGCAGTGGGGACCACACTAG 

CAGCCCTGACTCCACACTCCTCCTGGGGACCCAAGAGGCAGTGTTGCTGTCTGCGTGTCCACCTTGGAATCTGGCTG 

AACTGGCTGGGAGGACCAAGACTGCGGCTGGGGTGGGCAGGGAAGGGAAGCCGGGGGCTGCTGTGAGGGATCTTGGA 

GCTTCCCTGTAGCCCACCTTCCCCTTGCTTCATGTTTGTAGAGGAACCTTGTGCCGGCCAGGCCCAGTTTCCTTGTG 

TGATACACTAATGTATTTGCTTTTTTTGGAAATAGAGAAAATCAATAAATTGCTAGTGTTTCTTTGAACTTTTTCGA 

SEQ ID NO:1602 

>HSB6PR # transcript^ #len 5453 (Includes node 29 - TAA seg 34) 

GGGGTGGTGCAGGGCAGGGGTGGTATATCCTGTCTGACGGAGGGCGGGCCTCGCCAGTGCCAGAGAGGGACGAACCA 
GGGTGGAAGCGCCAGGAGCAGCTGCAGGGAGCCCTCACGCGGACCTCGCACTCTATGGCCGTAGGGAGCCGCTGAGA 
GCGAGAAGAGCACGCTCCTGCCCGCCCGCTGCACCGCACCTCGCCTCGCCTCTCTGCTCTCCTAGGCCCCGGCCGCG 
CGCCACCCGCCTCCCGCCACCATGAACCACTCGCCGCTCAAGACCGCCTTGGCGTACGAATGCTTCCAGGACCAGGA 
CAACTCCACGTTGGCTTTGCCGTCGGACCAAAAGATGAAAACAGGCACGTCTGGCAGGCAGCGCGTGCAGGAGCAGG 
TGATGATGACCGTCAAGCGGCAGAAGTCCAAGTCTTCCCAGTCGTCCACCCTGAGCCACTCCAATCGAGGTTCCATG 
TATGATGGCTTGGCTGACAATTACAACTATGGGACCACCAGCAGGAGCAGCTACTACTCCAAGTTCCAGGCAGGGAA 
TGGCTCATGGGGATATCCGATCTACAATGGAACCCTCAAGCGGGAGCCTGACAACAGGCGCTTCAGCTCCTACAGCC 
AGATGGAGAACTGGAGCCGGCACTACCCCCGGGGCAGCTGTAACACCACCGGCGCAGGCAGCGACATCTGCTTCATG 
CAGAAAATCAAGGCGAGCCGCAGTGAGCCCGACCTCTACTGTGACCCACGGGGCACCCTGCGCAAGGGCACGCTGGG 
CAGCAAGGGCCAGAAGACCACCCAGAACCGCTACAGCTTTTACAGCACCTGCAGTGGTCAGAAGGCCATAAAGAAGT 
GCCCTGTGCGCCCGCCCTCTTGTGCCTCCAAGCAGGACCCTGTGTATATCCCGCCCATCTCCTGCAACAAGGACCTG 
TCCTTTGGCCACTCTAGGGCCAGCTCCAAGATCTGCAGTGAGGACATCGAGTGCAGTGGGCTGACCATCCCCAAGGC 
TGTGCAGTACCTGAGCTCCCAGGATGAGAAGTACCAGGCCATTGGGGCCTATTACATCCAGCATACCTGCTTCCAGG 
ATGAATCTGCCAAGCAACAGGTCTATCAGCTGGGAGGCATCTGCAAGCTGGTGGACCTCCTCCGCAGCCCCAACCAG 
AACGTCCAGCAGGCCGCGGCAGGGGCCCTGCGCAACCTGGTGTTCAGGAGCACCACCAACAAGCTGGAGACCCGGAG 
GCAGAATGGGATCCGCGAGGCAGTCAGCCTCCTGAGGAGAACCGGGAACGCCGAGATCCAGAAGCAGCTGACTGGGC

TGCTCTGGAACCTGTCTTCCACTGACGAGCTGAAGGAGGAACTCATTGCCGACGCCCTGCCTGTTCTGGCCGACCGC 

GTCATCATTCCCTTCTCTGGCTGGTGCGATGGCAATAGCAACATGTCCCGGGAAGTGGTGGACCCTGAGGTCTTCTT 

CAATGCCACAGGCTGCTTGAGAAAGAGACTGGGCATGCGGGAGCTTCTGGCTCTTGTTCCGCAAAGGGCCACTAGTA 

GCAGGGTGAACCTGAGCTCGGCCGATGCAGGCCGCCAGACCATGCGTAACTACTCAGGGCTCATTGATTCCCTCATG 

GCCTATGTCCAGAACTGTGTAGCGGCCAGCCGCTGTGACGACAAGTCTGTGGAAAACTGCATGTGTGTTCTGCACAA 

CCTCTCCTACCGCCTGGACGCCGAGGTGCCCACCCGCTACCGCCAGCTGGAGTATAACGCCCGCAACGCCTACACCG 

AGAAGTCCTCCACTGGCTGCTTCAGCAACAAGAGCGACAAGATGATGAACAACAACTATGACTGCCCCCTGCCTGAG 

GAAGAGACCAACCCCAAGGGCAGCGGCTGGTTGTACCATTCAGATGCCATCCGCACCTACCTGAACCTCATGGGCAA 

GAGCAAGAAAGATGCTACCCTGGAGGCCTGTGCTGGTGCCCTGCAGAACCTGACAGCCAGCAAGGGGCTGATGTCCA 

GTGGCATGAGCCAGTTGATTGGGCTGAAGGAAAAGGGCCTGCCACAAATTGCCCGCCTCCTGCAATCTGGCAACTCT 

GATGTGGTGCGGTCCGGAGCCTCCCTCCTGAGCAACATGTCCCGCCACCCTCTGCTGCACAGAGTGATGGGGAACCA 

GGTGTTCCCGGAGGTGACCAGGCTCCTCACCAGCCACACTGGCAATACCAGCAACTCCGAAGACATCTTGTCCTCGG 

CCTGCTACACTGTGAGGAACCTGATGGCCTCGCAGCCACAACTGGCCAAGCAGTACTTCTCCAGCAGCATGCTCAAC 

AACATCATCAACCTGTGCCGAAGCAGTGCCTCACCCAAGGCCGCAGAAGCTGCCCGGCTTCTCCTGTCTGACATGTG 

GTCCAGCAAGGAACTGCAGGGTGTCCTCAGACAGCAAGGTTTCGATAGGAACATGCTGGGAACCTTAGCTGGGGCCA 

ACAGCCTCAGGAACTTCACCTCCCGATTCTAAGAAGAGACTGTCCAAGCAAGTTAGGCTTGCAGGAAGATATGACCC 

AGCTGAGAAGCCCTCAGGCCTCGCTGGATGGGGTTTTCTGTCCATCCTGTGCAGTATTTGGGAAAGTTCACAAGAAA 

CTGAGAAGAAACCTAAAAACTGTGGATAGTGGAAAGATTTTTAGATTTTTTTTTTCCTTGGGGAAACTGGCAGGCAA 

TGGGGGTTAGGGAGGTTGGGGCGGGGGGGGCTTTCTTGAGTTAAAGGGGCTTATATGTGATGTCAATATTTCTTCCT 

CTGAGAAATGGTATATATATGTGTCTAATGTAAGTGTGTGCATGCATGTGCGCGTGCATGTGTGTGTGTGTGAGTGT 

CTTAAAGCATAACCACAAACTGCAAAAAGCTAGGTAAGCTATTTTGTTGCAGCTCATAAGGTGGTGAAAAGGACTCT 

CCTGTGTTTCTTACTCATAGGCAAGGACAACATGTGCTTTTTGGTGAGCTGCTCATAATTCCTGAAATGTGTGGTGC 

CAGGGCAAGGGGGCCATCACTGCAGTCAGGCCCTCAGAGGAGTCCTGCAGGCTTCCTACCAGTGGTCTCCAAGGGTG

CAGGAGTAACTGGGGCTGGGCCAGCCTCCCCCCTTACAAGGCTGCTTTCCAGGAAGGGAGGTCTGGTGTATCTCATG 

GGAGAATCTGGGGTGTCTGTAGTGTCACCCCTCCAGCAGCGCCACAAGGACTGAGGTTGGGTAGGTGTGAGGTTCCA 

GAGGACAGCAGGACACTCTCGCATACTTTGCCAAATGAGGCCTGCTCAGAGGAGTAGGAGCTGAAAGATGGTGCCTT 

CCACCCTCTTGGGCTGTGTGCCCATCAGAGCAGGCTCAGCCTGCAAAGGCCCTGCATTCAGAGGTCTTGTAATCTAC 

TTGTTGCAGGAGAAAGAAGGTAAAAAATGATTTTTTTAAGAAAAGCTATTTTATTGCAGCTCTTTCCCAAGAGCTGT 

TCTGGGAATGGCTGGTCTTCATATTCCCAGTGGAGAGGGGAACAAGTGGGGCTGGGCATATACCTATTCCGGCTTCT 

AGTGGGATGGAGTTGGGGTATAGAAATTAACCAGGAAGATGTTTCCACCAAGCCTGCTGTGAGTCAATTGAGGGAGT 

GTTTGGGGTCCCAGGAGACTTGGACGGGGGGAGTTTGGGTAGACTAGGAAAGGAAAGTGCCATATCAGGGTACCGGT 

ACCGGCAAGCTCACATCTCAGCCAGGGGCCATGCCCCACTTCCCCTGACCCCAGCTGTCTTGTCTCCACTCTGTGAA 

ACCCACAGGGGATGTGATAAACAGGGCTATTAGGGGTATCAGCCACGTCGAGCCCCCAGACTCTGTGCACTTCAGAC 

CAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTATGAGAAAACCTGTGTGGACATCCCTTGGTGTACACTAAGACAGAG 

CAGAGCCCAGCGCTCCCAAGCCTTCCTCCTTCCAGCTTCTACCTCCATGCTAGCATTGCTGGTGTTAGAGAGGAATT 

AACTTCCTGGTCTGTGCCCTTCTCTAGAAGAATATAAGATGCTCCTCCTCCTCACCCCTTCTCAGCCTCCTCCCAAG 

TCTTCCTCTTCTGCACCACCCCCGAGTCCAAACCCACCTCTTGCCCCAGCATTCAGGCTGGAAAACACTGATGTGGA 
CTCAGTATGACAACTGAGATGGGGGAAGCCAGACATGTGAGGACGCTGTCCTCCGAGAGGTGTCCCCGGCTGTTAGC

CAGCTGTGCTGTGGTGCTGTGGGTCTGTCATACCCTCCCTTGCTTCTGTTCACACTGGGAGGCCCACTCCTGGCTCA

CCTCTCCCTCTCAGGGACCCACGTGGGAGCCTGGATCCCTGGACTGTCCTGGGCATAGGTTTCAGGGGCCTCCTTTG 

TTGTCATCAGAACCCAGAGGAATTCTTCTCCTAAAAAATACGTATGGCATACCAATCTGTGCGGGGCAGTGTCCTAA 

GCACTTAGACTACATCAGGGAAGAACACAGACCACATCCCTGTCCTCATGCGGCTTATGTTTTCTGGAGGAAAGTGG 

AGACACAAGTCCTTGGCTTTAGGGCTCCCCCGGCTGGGGGCTGTGCAGTCCGGTCAGGGCGGGAGGGGAAATGCACC 

GCTGCATGTGAACCTTACCAGCCCAGGCGGATGCCCCTTCCCCTTAGCACTACCCTGGCCTCCTGCATCCCCTCGCC 

TCATGTTCCTCCCACCTTCAAAGAATGAAGAGCCCCATGGGCCCAGCCCCTGCCCTGGGAACCAGGCAGCCTTCCAG 

ACCTCAGGGGCTGAGGCAGACTATTAGGGCAGGGCTGACTTTGGTGACACTGCCCATTCCCTCTCAGGCCAGCTCAG 

GTCACCCGGGCCTCTGACCCAGGCCTGTCACTTTGAGAGGGGCAAAACTGAGAGGGGCTTTTCCTAGAGAAAGAGAA 

CAAGGAGCTTGCCAGGCTTCATGTAGCCGACACACGTCTCAGGATTTTAAGTCCACATTGGCCTCACACTACCAGGG 

CCAATGCCCAAAATAAGGAGTTCCAATTTGGGGCCAAATGAGGAAGGACACAGACTCTGCCCTGGGATCTCCTGTGC 

TAGCGGCCAATGACAAATCCAGTCATTGGCCACCAGCCACCTCTGCAGTGGGGACCACACTAGCAGCCCTGACTCCA 

CACTCCTCCTGGGGACCCAAGAGGCAGTGTTGCTGTCTGCGTGTCCACCTTGGAATCTGGCTGAACTGGCTGGGAGG 

ACCAAGACTGCGGCTGGGGTGGGCAGGGAAGGGAAGCCGGGGGCTGCTGTGAGGGATCTTGGAGCTTCCCTGTAGCC 

CACCTTCCCCTTGCTTCATGTTTGTAGAGGAACCTTGTGCCGGCCAGGCCCAGTTTCCTTGTGTGATACACTAATGT

ATTTGCTTTTTTTGGAAATAGAGAAAATCAATAAATTGCTAGTGTTTCTTTGAACTTTTTCGA 

SEQ ID NO:1603 

>HSB6PR # transcriptj #len 4543 (Includes node 29 - TAA seg 34 ; node 33 - 
TAA seg 8) 

TTCCATTTTCCTAATACCAGACACAAACAAGAGAGATGGCAAGGATCTGCAGTGAGGACATCGAGTGCAGTGGGCTG 

ACCATCCCCAAGGCTGTGCAGTACCTGAGCTCCCAGGATGAGAAGTACCAGGCCATTGGGGCCTATTACATCCAGCA 

TACCTGCTTCCAGGATGAATCTGCCAAGCAACAGGTCTATCAGCTGGGAGGCATCTGCAAGCTGGTGGACCTCCTCC 

GCAGCCCCAACCAGAACGTCCAGCAGGCCGCGGCAGGGGCCCTGCGCAACCTGGTGTTCAGGAGCACCACCAACAAG 

CTGGAGACCCGGAGGCAGAATGGGATCCGCGAGGCAGTCAGCCTCCTGAGGAGAACCGGGAACGCCGAGATCCAGAA 

GCAGCTGACTGGGCTGCTCTGGAACCTGTCTTCCACTGACGAGCTGAAGGAGGAACTCATTGCCGACGCCCTGCCTG 

TTCTGGCCGACCGCGTCATCATTCCCTTCTCTGGCTGGTGCGATGGCAATAGCAACATGTCCCGGGAAGTGGTGGAC 

CCTGAGGTCTTCTTCAATGCCACAGGCTGCTTGAGAAAGAGACTGGGCATGCGGGAGCTTCTGGCTCTTGTTCCGCA 

AAGGGCCACTAGTAGCAGGGTGAACCTGAGCTCGGCCGATGCAGGCCGCCAGACCATGCGTAACTACTCAGGGCTCA 

TTGATTCCCTCATGGCCTATGTCCAGAACTGTGTAGCGGCCAGCCGCTGTGACGACAAGTCTGTGGAAAACTGCATG 

TGTGTTCTGCACAACCTCTCCTACCGCCTGGACGCCGAGGTGCCCACCCGCTACCGCCAGCTGGAGTATAACGCCCG 

CAACGCCTACACCGAGAAGTCCTCCACTGGCTGCTTCAGCAACAAGAGCGACAAGATGATGAACAACAACTATGACT 

GCCCCCTGCCTGAGGAAGAGACCAACCCCAAGGGCAGCGGCTGGTTGTACCATTCAGATGCCATCCGCACCTACCTG 

AACCTCATGGGCAAGAGCAAGAAAGATGCTACCCTGGAGGCCTGTGCTGGTGCCCTGCAGAACCTGACAGCCAGCAA 

GGGGCTGATGTCCAGTGGCATGAGCCAGTTGATTGGGCTGAAGGAAAAGGGCCTGCCACAAATTGCCCGCCTCCTGC 

AATCTGGCAACTCTGATGTGGTGCGGTCCGGAGCCTCCCTCCTGAGCAACATGTCCCGCCACCCTCTGCTGCACAGA 

GTGATGGGGAACCAGGTGTTCCCGGAGGTGACCAGGCTCCTCACCAGCCACACTGGCAATACCAGCAACTCCGAAGA 



CATCTTGTCCTCGGCCTGCTACACTGTGAGGAACCTGATGGCCTCGCAGCCACAACTGGCCAAGCAGTACTTCTCCA 

GCAGCATGCTCAACAACATCATCAACCTGTGCCGAAGCAGTGCCTCACCCAAGGCCGCAGAAGCTGCCCGGCTTCTC 

CTGTCTGACATGTGGTCCAGCAAGGAACTGCAGGGTGTCCTCAGACAGCAAGGTTTCGATAGGAACATGCTGGGAAC 

CTTAGCTGGGGCCAACAGCCTCAGGAACTTCACCTCCCGATTCTAAGAAGAGACTGTCCAAGCAAGTTAGGCTTGCA 

GGAAGATATGACCCAGCTGAGAAGCCCTCAGGCCTCGCTGGATGGGGTTTTCTGTCCATCCTGTGCAGTATTTGGGA 

AAGTTCACAAGAAACTGAGAAGAAACCTAAAAACTGTGGATAGTGGAAAGATTTTTAGATTTTTTTTTTCCTTGGGG 

AAACTGGCAGGCAATGGGGGTTAGGGAGGTTGGGGCGGGGGGGGCTTTCTTGAGTTAAAGGGGCTTATATGTGATGT 

CAATATTTCTTCCTCTGAGAAATGGTATATATATGTGTCTAATGTAAGTGTGTGCATGCATGTGCGCGTGCATGTGT 

GTGTGTGTGAGTGTCTTAAAGCATAACCACAAACTGCAAAAAGCTAGGTAAGCTATTTTGTTGCAGCTCATAAGGTG 

GTGAAAAGGACTCTCCTGTGTTTCTTACTCATAGGCAAGGACAACATGTGCTTTTTGGTGAGCTGCTCATAATTCCT 

GAAATGTGTGGTGCCAGGGCAAGGGGGCCATCACTGCAGTCAGGCCCTCAGAGGAGTCCTGCAGGCTTCCTACCAGT 

GGTCTCCAAGGGTGCAGGAGTAACTGGGGCTGGGCCAGCCTCCCCCCTTACAAGGCTGCTTTCCAGGAAGGGAGGTC 

TGGTGTATCTCATGGGAGAATCTGGGGTGTCTGTAGTGTCACCCCTCCAGCAGCGCCACAAGGACTGAGGTTGGGTA 

GGTGTGAGGTTCCAGAGGACAGCAGGACACTCTCGCATACTTTGCCAAATGAGGCCTGCTCAGAGGAGTAGGAGCTG 

AAAGATGGTGCCTTCCACCCTCTTGGGCTGTGTGCCCATCAGAGCAGGCTCAGCCTGCAAAGGCCCTGCATTCAGAG 

GTCTTGTAATCTACTTGTTGCAGGAGAAAGAAGGTAAAAAATGATTTTTTTAAGAAAAGCTATTTTATTGCAGCTCT 

TTCCCAAGAGCTGTTCTGGGAATGGCTGGTCTTCATATTCCCAGTGGAGAGGGGAACAAGTGGGGCTGGGCATATAC 

CTATTCCGGCTTCTAGTGGGATGGAGTTGGGGTATAGAAATTAACCAGGAAGATGTTTCCACCAAGCCTGCTGTGAG

TCAATTGAGGGAGTGTTTGGGGTCCCAGGAGACTTGGACGGGGGGAGTTTGGGTAGACTAGGAAAGGAAAGTGCCAT 

ATCAGGGTACCGGTACCGGCAAGCTCACATCTCAGCCAGGGGCCATGCCCCACTTCCCCTGACCCCAGCTGTCTTGT 

CTCCACTCTGTGAAACCCACAGGGGATGTGATAAACAGGGCTATTAGGGGTATCAGCCACGTCGAGCCCCCAGACTC 

TGTGCACTTCAGACCAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTATGAGAAAACCTGTGTGGACATCCCTTGGTGT 

ACACTAAGACAGAGCAGAGCCCAGCGCTCCCAAGCCTTCCTCCTTCCAGCTTCTACCTCCATGCTAGCATTGCTGGT 

GTTAGAGAGGAATTAACTTCCTGGTCTGTGCCCTTCTCTAGAAGAATATAAGATGCTCCTCCTCCTCACCCCTTCTC 

AGCCTCCTCCCAAGTCTTCCTCTTCTGCACCACCCCCGAGTCCAAACCCACCTCTTGCCCCAGCATTCAGGCTGGAA 

AACACTGATGTGGACTCAGTATGACAACTGAGATGGGGGAAGCCAGACATGTGAGGACGCTGTCCTCCGAGAGGTGT 

CCCCGGCTGTTAGCCAGCTGTGCTGTGGTGCTGTGGGTCTGTCATACCCTCCCTTGCTTCTGTTCACACTGGGAGGC 

CCACTCCTGGCTCACCTCTCCCTCTCAGGGACCCACGTGGGAGCCTGGATCCCTGGACTGTCCTGGGCATAGGTTTC 

AGGGGCCTCCTTTGTTGTCATCAGAACCCAGAGGAATTCTTCTCCTAAAAAATACGTATGGCATACCAATCTGTGCG 

GGGCAGTGTCCTAAGCACTTAGACTACATCAGGGAAGAACACAGACCACATCCCTGTCCTCATGCGGCTTATGTTTT 

CTGGAGGAAAGTGGAGACACAAGTCCTTGGCTTTAGGGCTCCCCCGGCTGGGGGCTGTGCAGTCCGGTCAGGGCGGG 

AGGGGAAATGCACCGCTGCATGTGAACCTTACCAGCCCAGGCGGATGCCCCTTCCCCTTAGCACTACCCTGGCCTCC 

TGCATCCCCTCGCCTCATGTTCCTCCCACCTTCAAAGAATGAAGAGCCCCATGGGCCCAGCCCCTGCCCTGGGAACC 

AGGCAGCCTTCCAGACCTCAGGGGCTGAGGCAGACTATTAGGGCAGGGCTGACTTTGGTGACACTGCCCATTCCCTC 

TCAGGCCAGCTCAGGTCACCCGGGCCTCTGACCCAGGCCTGTCACTTTGAGAGGGGCAAAACTGAGAGGGGCTTTTC 

CTAGAGAAAGAGAACAAGGAGCTTGCCAGGCTTCATGTAGCCGACACACGTCTCAGGATTTTAAGTCCACATTGGCC 

TCACACTACCAGGGCCAATGCCCAAAATAAGGAGTTCCAATTTGGGGCCAAATGAGGAAGGACACAGACTCTGCCCT 

GGGATCTCCTGTGCTAGCGGCCAATGACAAATCCAGTCATTGGCCACCAGCCACCTCTGCAGTGGGGACCACACTAG 
CAGCCCTGACTCCACACTCCTCCTGGGGACCCAAGAGGCAGTGTTGCTGTCTGCGTGTCCACCTTGGAATCTGGCTG
AACTGGCTGGGAGGACCAAGACTGCGGCTGGGGTGGGCAGGGAAGGGAAGCCGGGGGCTGCTGTGAGGGATCTTGGA 
GCTTCCCTGTAGCCCACCTTCCCCTTGCTTCATGTTTGTAGAGGAACCTTGTGCCGGCCAGGCCCAGTTTCCTTGTG 
TGATACACTAATGTATTTGCTTTTTTTGGAAATAGAGAAAATCAATAAATTGCTAGTGTTTCTTTGAACTTTTTCGA 

SEQ ID NO: 1604 

>HSB6PR_PEA_2_P1 # trn_0 #len 726

MNHSPLKTALAYECFQDQDNSTLALPSDQKMKTGTSGRQRVQEQVMMTVKRQKSKSSQSSTLSHSNRGSMYDGLADN
YNYGTTSRSSYYSKFQAGNGSWGYPIYNGTLKREPDNRRFSSYSQMENWSRHYPRGSCNTTGAGSDICFMQKIKASR
SEPDLYCDPRGTLRKGTLGSKGQKTTQNRYSFYSTCSGQKAIKKCPVRPPSCASKQDPVYIPPISCNKDLSFGHSRA
SSKICSEDIECSGLTIPKAVQYLSSQDEKYQAIGAYYIQHTCFQDESAKQQVYQLGGICKLVDLLRSPNQNVQQAAA 
GALRNLVFRSTTNKLETRRQNGIREAVSLLRRTGNAEIQKQLTGLLWNLSSTDELKEELIADALPVLADRVIIPFSG 
WCDGNSNMSREVVDPEVFFNATGCLRNLSSADAGRQTMRNYSGLIDSLMAYVQNCVAASRCDDKSVENCMCVLHNLS 
YRLDAEVPTRYRQLEYNARNAYTEKSSTGCFSNKSDKMMNNNYDCPLPEEETNPKGSGWLYHSDAIRTYLNLMGKSK 
KDATLEACAGALQNLTASKGLMSSGMSQLIGLKEKGLPQIARLLQSGNSDVVRSGASLLSNMSRHPLLHRVMGNQVF 
PEVTRLLTSHTGNTSNSEDILSSACYTVRNLMASQPQLAKQYFSSSMLNNIINLCRSSASPKAAEAARLLLSDMWSS 
KELQGVLRQQGFDRNMLGTLAGANSLRNFTSRF

SEQ ID NO:1605 

>HSB6PR_PEA_2_P6 # trn_5 #len 747

MNHSPLKTALAYECFQDQDNSTLALPSDQKMKTGTSGRQRVQEQVMMTVKRQKSKSSQSSTLSHSNRGSMYDGLADN 
YNYGTTSRSSYYSKFQAGNGSWGYPIYNGTLKREPDNRRFSSYSQMENWSRHYPRGSCNTTGAGSDICFMQKIKASR
SEPDLYCDPRGTLRKGTLGSKGQKTTQNRYSFYSTCSGQKAIKKCPVRPPSCASKQDPVYIPPISCNKDLSFGHSRA
SSKICSEDIECSGLTIPKAVQYLSSQDEKYQAIGAYYIQHTCFQDESAKQQVYQLGGICKLVDLLRSPNQNVQQAAA
GALRNLVFRSTTNKLETRRQNGIREAVSLLRRTGNAEIQKQLTGLLWNLSSTDELKEELIADALPVLADRVIIPFSG 
WCDGNSNMSREVVDPEVFFNATGCLRKRLGMRELLALVPQRATSSRVNLSSADAGRQTMRNYSGLIDSLMAYVQNCV 
AASRCDDKSVENCMCVLHNLSYRLDAEVPTRYRQLEYNARNAYTEKSSTGCFSNKSDKMMNNNYDCPLPEEETNPKG 
SGWLYHSDAIRTYLNLMGKSKKDATLEACAGALQNLTASKGLMSSGMSQLIGLKEKGLPQIARLLQSGNSDVVRSGA 
SLLSNMSRHPLLHRVMGNQVFPEVTRLLTSHTGNTSNSEDILSSACYTVRNLMASQPQLAKQYFSSSMLNNIINLCR 
SSASPKAAEAARLLLSDMWSSKELQGVLRQQGFDRNMLGTLAGANSLRNFTSRF 

SEQ ID NO:1606 

>HSB6PR_PEA_2_P7 # trn_6 #len 516 

MARICSEDIECSGLTIPKAVQYLSSQDEKYQAIGAYYIQHTCFQDESAKQQVYQLGGICKLVDLLRSPNQNVQQAAA
GALRNLVFRSTTNKLETRRQNGIREAVSLLRRTGNAEIQKQLTGLLWNLSSTDELKEELIADALPVLADRVIIPFSG 
WCDGNSNMSREVVDPEVFFNATGCLRKRLGMRELLALVPQRATSSRVNLSSADAGRQTMRNYSGLIDSLMAYVQNCV 
AASRCDDKSVENCMCVLHNLSYRLDAEVPTRYRQLEYNARNAYTEKSSTGCFSNKSDKMMNNNYDCPLPEEETNPKG 
SGWLYHSDAIRTYLNLMGKSKKDATLEACAGALQNLTASKGLMSSGMSQLIGLKEKGLPQIARLLQSGNSDVVRSGA 
SLLSNMSRHPLLHRVMGNQVFPEVTRLLTSHTGNTSNSEDILSSACYTVRNLMASQPQLAKQYFSSSMLNNIINLCR
SSASPKAAEAARLLLSDMWSSKELQGVLRQQGFDRNMLGTLAGANSLRNFTSRF 

SEQ ID NO:1607 

>HSB6PR # node_33 (TAA seg 8) #len 43 
TTCCATTTTCCTAATACCAGACACAAACAAGAGAGATGGCAAG 
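
Each entry in this listing declares its length after a `#len` tag, which makes it possible to check a transcribed sequence against its own header. The following is a minimal sketch of such a check, assuming the header layout seen in the entry above; the helper names `declared_length` and `clean_sequence` are illustrative and are not part of the original listing.

```python
import re
from typing import Iterable, Optional


def declared_length(header: str) -> Optional[int]:
    """Return the integer that follows '#len' in a header line, or None if absent."""
    match = re.search(r"#len\s+(\d+)", header)
    return int(match.group(1)) if match else None


def clean_sequence(lines: Iterable[str]) -> str:
    """Join wrapped sequence lines and drop any whitespace left by line wrapping."""
    return "".join("".join(line.split()) for line in lines)


# Header and sequence copied from the node_33 entry (SEQ ID NO:1607).
header = ">HSB6PR # node_33 (TAA seg 8) #len 43"
sequence = clean_sequence(["TTCCATTTTCCTAATACCAGACACAAACAAGAGAGATGGCAAG"])

# The two numbers should agree when the entry was transcribed without loss.
print(declared_length(header), len(sequence))
```

A mismatch between the two values would suggest a transcription or extraction error in either the header or the sequence body.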

SEQ ID NO:1608 

>HSB6PR # node_29 (TAA seg 34) #len 1199

CTCTGAGAAATGGTATATATATGTGTCTAATGTAAGTGTGTGCATGCATGTGCGCGTGCATGTGTGTGTGTGTGAGT

GTCTTAAAGCATAACCACAAACTGCAAAAAGCTAGGTAAGCTATTTTGTTGCAGCTCATAAGGTGGTGAAAAGGACT 

CTCCTGTGTTTCTTACTCATAGGCAAGGACAACATGTGCTTTTTGGTGAGCTGCTCATAATTCCTGAAATGTGTGGT 

GCCAGGGCAAGGGGGCCATCACTGCAGTCAGGCCCTCAGAGGAGTCCTGCAGGCTTCCTACCAGTGGTCTCCAAGGG 

TGCAGGAGTAACTGGGGCTGGGCCAGCCTCCCCCCTTACAAGGCTGCTTTCCAGGAAGGGAGGTCTGGTGTATCTCA 

TGGGAGAATCTGGGGTGTCTGTAGTGTCACCCCTCCAGCAGCGCCACAAGGACTGAGGTTGGGTAGGTGTGAGGTTC 

CAGAGGACAGCAGGACACTCTCGCATACTTTGCCAAATGAGGCCTGCTCAGAGGAGTAGGAGCTGAAAGATGGTGCC 

TTCCACCCTCTTGGGCTGTGTGCCCATCAGAGCAGGCTCAGCCTGCAAAGGCCCTGCATTCAGAGGTCTTGTAATCT 

ACTTGTTGCAGGAGAAAGAAGGTAAAAAATGATTTTTTTAAGAAAAGCTATTTTATTGCAGCTCTTTCCCAAGAGCT 

GTTCTGGGAATGGCTGGTCTTCATATTCCCAGTGGAGAGGGGAACAAGTGGGGCTGGGCATATACCTATTCCGGCTT 

CTAGTGGGATGGAGTTGGGGTATAGAAATTAACCAGGAAGATGTTTCCACCAAGCCTGCTGTGAGTCAATTGAGGGA 

GTGTTTGGGGTCCCAGGAGACTTGGACGGGGGGAGTTTGGGTAGACTAGGAAAGGAAAGTGCCATATCAGGGTACCG 

GTACCGGCAAGCTCACATCTCAGCCAGGGGCCATGCCCCACTTCCCCTGACCCCAGCTGTCTTGTCTCCACTCTGTG 

AAACCCACAGGGGATGTGATAAACAGGGCTATTAGGGGTATCAGCCACGTCGAGCCCCCAGACTCTGTGCACTTCAG 

ACCAGCAGCAGCAGGAGGGCTCCCGAGGGCCTTATGAGAAAACCTGTGTGGACATCCCTTGGTGTACACTAAGACAG 

AGCAGAGCCCAGCGCTCCCAAGCCTTCCTCCTTCCAGCTTCTAC 

SEQ ID NO:1609 

>T86235 # transcript_31 #len 2871 (Includes node 39 - TAA seg 44; node 37 - 
TAA seg 42) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 
TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 
CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 
GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 
CATCACAAAGATTGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCACCATCCTGTCAGGTG 
AGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAGGAAGTCAGGGAGGC 
ACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGC 
CAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTC 
TGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGC 
AGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAA

AGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAA 

GAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAG 

CCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAAC 

TCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCA 

GAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAG 

GTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCC 

TTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGT 

TGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAA 

CTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGG 

ACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAG 

AGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTA 

CCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTC 

TCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTG 

AGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCC 

CTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTG 

CACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGA 

TCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAG 

GCAGGTAAGGAGTTGGCTGGGAAGGAGTGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGC 

CCCTACCCCAGGCAATCTCATGCTTCCACACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCAGGAAGAGGGGAGGG 

GCTGCACTTCCAGCCCTGTGCTCCTAATTGGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACAGTACATGGTGGAAG 

TATAGGACCCCAGACCTCCCTCTAAATTTTCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCT 

GAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTT 

ACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGAT 

GCCCTGGTGAGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGCCTGATGGGAGGTGGGGAACAGGGACAGG 

GGGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTACAGTGCAAGGTTGGCTGAGCTGTGACAAG 

GTCTTCTCTGTCTCCAGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCC 

TGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGT 

CCTCAATAAAGTTTCTAAAGTA
SEQ ID NO:1610 

>T86235 # transcript_32 #len 2514 (Includes node 39 - TAA seg 44) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA 

TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 

CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 

GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 

CATCACAAAGATTGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCACCATCCTGTCAGGTG 

AGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAGGAAGTCAGGGAGGC 
ACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGC

CAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTC 

TGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGC 

AGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAA 

AGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAA 

GAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAG 

CCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAAC 

TCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCA 

GAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAG 

GTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCC 

TTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGT 

TGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAA 

CTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGG 

ACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAG 

AGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTA 

CCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTC 

TCGCCAGGAACAGCTTGAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCCCTTCAGCCCA 

GCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTGCACCCTGGAA 

CATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGATCTTCTCTTC 

CCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAGGCAGGCCTCA 

GCAATCTGGCCCCTCGAACCCTAGCCCTGAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAG 

GCTCGTCTGGACGATGAGTGTGCCTTTTACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCC 

TGTGGCTACATTACTCGAATGGCAGGATGCCCTGGTGAGACTCCAACCCACAGCCCAGCTGTGGCTGCACAGTGAGC 

CTGATGGGAGGTGGGGAACAGGGACAGGGGGCCACCTGGGCTTCTTCACAGAGAGGTCAGCAGGAAGGCTTGGCTAC 

AGTGCAAGGTTGGCTGAGCTGTGACAAGGTCTTCTCTGTCTCCAGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCA 

GGGCTCTCCATGATGAGACAACCACTCCTGCCCTGCCGTACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCC 

CATGGGACTGGGAGCCGCCCACTTTTGTCCTCAATAAAGTTTCTAAAGTA 
SEQ ID NO:1611

>T86235 # transcript_33 #len 2706 (Includes node 37 - TAA seg 42) 

CTCCAGCAGCACCCGAGAGGGTCAGGAGAAAAGCGGAGGAAGCTGGGTAGGCCCTGAGGGGCCTCGGTAAGCCATCA

TGACCACCCGGCAAGCCACGAAGGATCCCCTCCTCCGGGGTGTATCTCCTACCCCTAGCAAGATTCCGGTACGCTCT 

CAGAAACGCACGCCTTTCCCCACTGTTACATCGTGCGCCGTGGACCAGGAGAACCAAGATCCAAGGAGATGGGTGCA 

GAAACCACCGCTCAATATTCAACGCCCCCTCGTTGATTCAGCAGGCCCCAGGCCGAAAGCCAGGCACCAGGCAGAGA 

CATCACAAAGATTGAGGCTCCAGGGACCATAGAGTTTGTGGCTGACCCTGCAGCCCTGGCCACCATCCTGTCAGGTG 

AGGGTGTGAAGAGCTGTCACCTGGGGCGCCAGCCTAGTCTGGCTAAAAGAGTACTGGTTCGAGGAAGTCAGGGAGGC 

ACCACCCAGAGGGTCCAGGGTGTTCGGGCCTCTGCATATTTGGCCCCCAGAACCCCCACCCACCGACTGGACCCTGC 

CAGGGCTTCCTGCTTCTCTAGGCTGGAGGGACCAGGACCTCGAGGCCGGACATTGTGCCCCCAGAGGCTACAGGCTC 
TGATTTCACCTTCAGGACCTTCCTTTCACCCTTCCACTCGCCCCAGTTTCCAGGAGCTAAGAAGGGAGACAGCTGGC

AGCAGCCGGACTTCAGTGAGCCAGGCCTCAGGATTGCTCCTGGAGACCCCAGTCCAGCCTGCTTTCTCTCTTCCTAA 

AGGAGAACGCGAGGTTGTCACTCACTCAGATGAAGGAGGTGTGGCCTCTCTTGGTCTGGCCCAGCGAGTACCATTAA 

GAGAAAACCGAGAAATGTCACATACCAGGGACAGCCATGACTCCCACCTGATGCCCTCCCCTGCCCCTGTGGCCCAG 

CCCTTGCCTGGCCATGTGGTGCCATGTCCATCACCCTTTGGACGGGCTCAGCGTGTACCCTCCCCAGGCCCTCCAAC 

TCTGACCTCATATTCAGTGTTGCGGCGTCTCACCGTTCAACCTAAAACCCGGTTCACACCCATGCCATCAACCCCCA 

GAGTTCAGCAGGCCCAGTGGCTGCGTGGTGTCTCCCCTCAGTCCTGCTCTGAAGATCCTGCCCTGCCCTGGGAGCAG 

GTTGCCGTCCGGTTGTTTGACCAGGAGAGTTGTATAAGGTCACTGGAGGGTTCTGGGAAACCACCGGTGGCCACTCC 

TTCTGGACCCCACTCTAACAGAACCCCCAGCCTCCAGGAGGTGAAGATTCAACGCATCGGTATCCTGCAACAGCTGT 

TGAGACAGGAAGTAGAGGGGCTGGTAGGGGGCCAGTGTGTCCCTCTTAATGGAGGCTCTTCTCTGGATATGGTTGAA 

CTTCAGCCCCTGCTGACTGAGATTTCTAGAACTCTGAATGCCACAGAGCATAACTCTGGGACTTCCCACCTTCCTGG 

ACTGTTAAAACACTCAGGGCTGCCAAAGCCCTGTCTTCCAGAGGAGTGCGGGGAACCACAGCCCTGCCCTCCGGCAG 

AGCCTGGGCCCCCAGAGGCCTTCTGTAGGAGTGAGCCTGAGATACCAGAGCCCTCCCTCCAGGAACAGCTTGAAGTA 

CCAGAGCCCTACCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTGCTGTAGGAGTGAGCCTGAGATACCGGAGTCCTC 

TCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCAGGCCCCTAGAGTCCTACTGTAGGATTG 

AGCCTGAGATACCGGAGTCCTCTCGCCAGGAACAGCTTGAGGTACCTGAGCCCTGCCCTCCAGCAGAACCCGGGCCC 

CTTCAGCCCAGCACCCAGGGGCAGTCTGGACCCCCAGGGCCCTGCCCTAGGGTAGAGCTGGGGGCATCAGAGCCCTG 

CACCCTGGAACATAGAAGTCTAGAGTCCAGTCTACCACCCTGCTGCAGTCAGTGGGCTCCAGCAACCACCAGCCTGA 

TCTTCTCTTCCCAACACCCGCTTTGTGCCAGCCCCCCTATCTGCTCACTCCAGTCTTTGAGACCCCCAGCAGGCCAG 

GCAGGTAAGGAGTTGGCTGGGAAGGAGTGTGAACACAAGAGGTCCTCACCTCACTGTGAGCTGCACACCTGCCCTGC 

CCCTACCCCAGGCAATCTCATGCTTCCACACCTTCCACCCTGGCCCAGCCTGGCTCTCCCTCAGGAAGAGGGGAGGG 

GCTGCACTTCCAGCCCTGTGCTCCTAATTGGCTTGGCCGTTGGTGGGGGAGGAGGAGAGGACAGTACATGGTGGAAG 

TATAGGACCCCAGACCTCCCTCTAAATTTTCCATGCCCCTCAGGCCTCAGCAATCTGGCCCCTCGAACCCTAGCCCT 

GAGGGAGCGCCTCAAATCGTGTTTAACCGCCATCCACTGCTTCCACGAGGCTCGTCTGGACGATGAGTGTGCCTTTT 

ACACCAGCCGAGCCCCTCCCTCAGGCCCCACCCGGGTCTGCACCAACCCTGTGGCTACATTACTCGAATGGCAGGAT 

GCCCTGTGTTTCATTCCAGTTGGTTCTGCTGCCCCCCAGGGCTCTCCATGATGAGACAACCACTCCTGCCCTGCCGT 

ACTTCTTCCTTTTAGCCCTTATTTATTGTCGGTCTGCCCATGGGACTGGGAGCCGCCCACTTTTGTCCTCAATAAAG 

TTTCTAAAGTA 
SEQ ID NO:1612 

>T86235_PEA_13_P25 # trn_31, 32, 33 #len 87 

MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE 
TSQRLRLQGP 

SEQ ID NO:1613 

>Unique aa coded by T31,32,33 [found in T86235_PEA_13_P25]
RLQGP 
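
The unique amino acid segment above is stated to be found in T86235_PEA_13_P25. As a hedged illustration, the sketch below locates that segment within the protein sequence given under SEQ ID NO:1612; the variable names are illustrative only, and the check assumes the two sequences were transcribed correctly.

```python
# Protein T86235_PEA_13_P25 (SEQ ID NO:1612), joined from its two wrapped lines.
protein = (
    "MTTRQATKDPLLRGVSPTPSKIPVRSQKRTPFPTVTSCAVDQENQDPRRWVQKPPLNIQRPLVDSAGPRPKARHQAE"
    "TSQRLRLQGP"
)

# Unique segment listed under SEQ ID NO:1613.
unique_segment = "RLQGP"

# str.find returns the 0-based offset of the first match, or -1 if absent.
offset = protein.find(unique_segment)
is_c_terminal = protein.endswith(unique_segment)

# Report the 1-based residue range, which is the usual convention for proteins.
print(offset + 1, offset + len(unique_segment), is_c_terminal)
```

Under these assumptions the segment sits at the C-terminal end of the protein, which is where a variant-specific tail would be expected.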
SEQ ID NO:1614
>Forward primer - 
TTTTCAAATGGGTAGGGACCATC 

SEQ ID NO:1615 
>Reverse primer - 
TGAGTTTCTCTGGGACCCGGA 

SEQ ID NO: 1616 
>Amplicon seq: 

TTTTCAAATGGGTAGGGACCATCCATGGAGCAGCTGGAACAGCAGTGGGGAGCATCAGAACCAGCTCAACAGTTTGT 
CTACTGTCCGGTCCCAGAGAAACTCA 

SEQ ID NO:1617 
>Forward primer 
TGTTTCTCCAAATGCCAGAACC 

SEQ ID NO: 1618 
>Reverse primer - 
GGCTGGTGACCTGCTTTGA 

SEQ ID NO:1619 
>Amplicon seq: 

TGTTTCTCCAAATGCCAGAACCCAACATTGATAGTCCCTTGAACACACATGCTGCCGAGCTCTGGAAAAACCCCACA 
GCTTTTAAGAAGTACCTGCAAGAAACCTACTCAAAGCAGGTCACCAGCC 
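
A primer pair and its amplicon are related in a simple way: the amplicon is expected to begin with the forward primer and to end with the reverse complement of the reverse primer. The sketch below checks this for the entries SEQ ID NO:1617 to SEQ ID NO:1619 above. It is a consistency check under that assumption, not a description of how the primers were designed, and the function names are illustrative.

```python
# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_flank_amplicon(forward: str, reverse: str, amplicon: str) -> bool:
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


forward = "TGTTTCTCCAAATGCCAGAACC"   # SEQ ID NO:1617
reverse = "GGCTGGTGACCTGCTTTGA"      # SEQ ID NO:1618
amplicon = (                          # SEQ ID NO:1619, two wrapped lines joined
    "TGTTTCTCCAAATGCCAGAACCCAACATTGATAGTCCCTTGAACACACATGCTGCCGAGCTCTGGAAAAACCCCACA"
    "GCTTTTAAGAAGTACCTGCAAGAAACCTACTCAAAGCAGGTCACCAGCC"
)

print(primers_flank_amplicon(forward, reverse, amplicon))
```

The same check can be applied to the other primer and amplicon triples in the listing; a failure may indicate an extraction error in one of the three sequences rather than a problem with the primers themselves.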

SEQ ID NO:1620 
>Forward primer 
GAGGCGAGGAGTGTGGCAC 

SEQ ID NO:1621 
>Reverse primer - 
GCTGCGATGGGCACGTT 

SEQ ID NO:1622 
>Amplicon : 

GAGGCGAGGAGTGTGGCACTTTGGCGGGGAAGGGGCGGCTCAGCCCTCGGGCCCTCGCCCGCCCTCTCCGGGTCTGG 
AGCGTCTCCTCGCGCCATCCCTGCACCGCCAGGGGGAACGTGCCCATCGCAGC 
SEQ ID NO:1623
>Forward primer - 
CCTAGCAGCTCGTCTCCAGC 

SEQ ID NO:1624 
>Reverse primer - 
GGTGTCCTCGCCGAACCT 

SEQ ID NO:1625 
>Amplicon 

CCTAGCAGCTCGTCTCCAGCTAAAAATCCTCCAGAGCCACCGGCGAACACTGCTAGGCACAGCCTGGTGCCAAGCTA 
CGAAGCGCCCGCCGCCGCCGTGCGCAGGTTCGGCGAGGACACC 

SEQ ID NO: 1626
>T23580_T10 

TTTTTGTTTGTGTACTTAATAAAGGGTAAATATGTCATGTTTGTTTGGAACAGTCATGGTTAATATCTATGTTGTCC 

CAGTATATCTATTAATAGAACTCTCTTTCACTCTCAACAGCGTCCTAGTCCGGATGACAAATTATATGGTTATCTCT 

CAGTAAAGGGTCTTTTTTTAAAATGATTTTTTTTTCAGGGGGTAGGGTAGGCAGGAAGCTTAAACTGGGTAATTTAG 

TTGTAGAAGAGTGCCCTGTGGCAAATAATTGATTATTCATTTCCAGCATCCCCTTTTCTTCTCCTTGACAGTTATTA 

AAAAAAAAAAAGTTACCAGCTTATGTCATTTTAAAGAACACTCGCCCTGAAAACTTCTGAGAGGTTGGCCATTTGAA 

ACCCTGGTTTTAGTGTCTGTATTATTAGTGAACTACCGTGTTCCCATGTGGCTACACAACCACAATTATGTACTATC 

TGGCTCTTTACCAAAGTTTGCAGACCTCTAATCTAGAGTGCGACATTTCCCCTCATTAACTCTTAGGTCCCTTGGCT 

CTAAAAGGGTATATTCATCTTGGCCCTATACAGGGAAAGGGGGAATGGGATTAATGATGTGCTTTGTAAGAAGAACC 

AATTTTAATTTTCACAAAGGCTTGACGTAGCTGTGAGAGAAAGGGTAAGAAGAAGCAGGCTTCTTCTTAGAAGTCTG 

AGATGGCCTAAAGTGGTGGGGGAAGAAGGGAGAGTGGGGAGAAAGAGAAACAAGAAAAGCTGAGAGTGAATTCCCCA 

GAGAGGTAGCCACTGATTCTGCCCTACTCTTTGCTGGAATTCTGGAAAACACCTGGGCTTCTAAAAGATAGGGAGCT 

CATGCATCATGGTAGGGCCACAGCTCAGGCTAGGGCCAGAGATAGCTCAGAGTAGCGCCACGGCTCAGGGTAGGTCC 

ACAGTGCAGGGTAGGGCCATAGCTCAGAGTAGGGCCATAGCTCATAACCACAGCTCAGGGTAGGACCTGCTGATCTA 

TTTGGGGACCCCCAGCAGAGCCTGTCTAATTGCATATCTTGAAAAGGATTGGAAAACTGTCATAATGACATTATTCC 

CTCTCACTTTCCTTGTCCAGGAAAGCCCAGCAGAATCGGAGAGGCTTTTCCGAGGAGCAGCTTCGCCAGGGACAGAA 

CGTAATAGGCCTGCAGATGGGCAGCAACAAGGGAGCCTCCCAGGCGGGCATGACAGGGTACGGGATGCCCAGGCAGA 

TCATGTAGGACGCGGCATCCTGCCCCTGGTAGAGAGGACGAATGTTCCACACCATGGTCTCTACGAAAAAGAAATAG 

TTAGTCACCTTCTGACCTTCTCCTCTTTCTCAAAGCCTTCTGTCCCTGGTTTTTGCAAGTGCTGCATTTCCGCCGAG 

AATCCGCGTTGCCTACTGCTGCCACCTCCTGTTCATTTAGAACTATGCAAAGACTCCGCTTCCGTTTTCCTGAGCTC 

CTCGGGCCCCAGAGTCTCTGTTTGATTATTTATTTATTTATTTATTTATTTGCCAAAAATTCTCCTCTTCAACTTAT 

AGAATGCACCTAATAAAGTAATTAGTCTTGTGTCTTACAGTG 
SEQ ID NO: 1627
>HUMOSTRO_PEA_1_PEA_1_P21

MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWLNPDPSQKQNLLAPQVFLNFS 

SEQ ID NO: 1628 
>HUMOSTRO_PEA_1_PEA_1_P25
MRIAVICFCLLGITCAIPVKQADSGSSEEKQH

SEQ ID NO: 1629 
>HUMOSTRO_PEA_1_PEA_1_P30

MRIAVICFCLLGITCAIPVKQADSGSSEEKQVSIFYVFI 

SEQ ID NO: 1630 
RPL19 -amplicon 

TGGCAAGAAGAAGGTCTGGTTAGACCCCAATGAGACCAATGAAATCGCCAATGCCAACTCCCGTCAGCAGATCCGGA 
AGCTCATCAAAGATGGGCTGATCA 

SEQ ID NO: 1631 

TATA box Forward primer 

CGGTTTGCTGCGGTAATCAT 

SEQ ID NO: 1632 

TATA box Reverse primer 

TTTCTTGCTGCCAGTCTGGAC 

SEQ ID NO: 1633 
TATA box -amplicon 

CGGTTTGCTGCGGTAATCATGAGGATAAGAGAGCCACGAACCACGGCACTGATTTTCAGTTCTGGGAAAATGGTGTG 
CACAGGAGCCAAGAGTGAAGAACAGTCCAGACTGGCAGCAAGAAA 

SEQ ID NO: 1634 
H61775seg8F2 

GAAGGCTCTTGTCACTTACTAGCCAT 

SEQ ID NO: 1635 
H61775seg8R2 
TGTCACCATATTTAATCCTCCCAA

SEQ ID NO: 1636 
H61775seg8 

GAAGGCTCTTGTCACTTACTAGCCATGTGATTTTGGAAAGAAACTTAACATTAATTCCTTCAGCTACAATGGAATTC 
TTGGGAGGATTAAATATGGTGACA 

SEQ ID NO: 1637 

M85491seg24F 

GGCGTCTTTCTCCCTCTGAAC 

SEQ ID NO: 1638 

M85491seg24R 

GTCCCATTCTGGGTGCTGTG 

SEQ ID NO: 1639 
M85491seg24 

GGCGTCTTTCTCCCTCTGAACCTCAGTTTCCACCTGTGTCGAGTGTGGGTGAGACCCCTCGCGGGGAGCTATGCAGG 
TTACGGAGAAAAGGCAGCACAGCACCCAGAATGGGAC 

SEQ ID NO: 1640 

Z21368 junc17-21 Forward primer
GGACGGATACAGCAGGAACG 

SEQ ID NO: 1641 

Z21368 junc17-21 Reverse primer
TATTTTCCAAAAAAGGCCAGCTC 

SEQ ID NO: 1642 

Z21368 junc17-21 Amplicon

GGACGGATACAGCAGGAACGAAAAAACATCCGACCCAACATTATTCTTGTGCTTACCGATGATCAAGATGTGGAGCT 
GGCCTTTTTTGGAAAATA 

SEQ ID NO: 1643 

Forward primer Z21368seg39F 

GTTGCATTTCTCAGTGCTGGTTT 

SEQ ID NO: 1644 
Reverse primer Z21368seg39R
AGGGTGCCGGGTGAGG 

SEQ ID NO: 1645 
Amplicon Z21368seg39:

GTTGCATTTCTCAGTGCTGGTTTCTAATCAGACCAGTGGATTGAGTTTCTCTACCATCCTCCCCACGTTCTTCTCTA 
AGCTGCCTCCAAGCCTCACCCGGCACCCT 

SEQ ID NO: 1646 
HUMGRP5Ejunc3-7F
ACCAGCCACCTCAACCCA 

SEQ ID NO: 1647 

HUMGRP5Ejunc3-7R 

CTGGAGCAGAGAGTCTTTGCCT 

SEQ ID NO: 1648 
HUMGRP5Ejunc3-7

ACCAGCCACCTCAACCCAAGGCCCTGGGCAATCAGCAGCCTTCGTGGGATTCAGAGGATAGCAGCAACTTCAAAGAT 
GTAGGTTCAAAAGGCAAAGACTCTCTGCTCCAG 

SEQ ID NO: 1649 

Z44808junc8-11 Forward primer
GAAGGCACAGGAAAAACAGATATTG 

SEQ ID NO: 1650 

Z44808junc8-11 Reverse primer
TGGTGCTCTTGGTCACAGGAT 

SEQ ID NO: 1651 
Z44808junc8-11 Amplicon:

GAAGGCACAGGAAAAACAGATATTGCATCACGTTACCCTACCCTTTGGACTGAACAGGTTAAAAGTCGGCAGAACAA 
AACCAATAAGAATTCAGTGTCATCCTGTGACCAAGAGCACCA 

SEQ ID NO: 1652 
Forward primer AA161187 seg17F2
CCCTGTGCCTTATTTGACCCT 

SEQ ID NO: 1653 

Reverse primer AA161187 seg17R2
GCTGGGTAGACTGGGTGCA 

SEQ ID NO: 1654 
Amplicon AA161187 seg25: 

CCTGTGCCTTATTTGACCCTCATGCCAACCCCGGGAGGTGGAGACTGTTGCCCCACTCTGCAGATGCAGAAACGGAG 
GCTTGGCTGCTGCCAGGGGGAGGA 

SEQ ID NO: 1655 

Forward primer -M62069 seg19F
GCTGATTGTCCCCATGAAGG 

SEQ ID NO: 1656 
Reverse primer- M62069 seg19
TGGCATACGGGAACTCAGTG 

SEQ ID NO: 1657 
Amplicon : 

GCTGATTGTCCCCATGAAGGCCAGCCTTGAAGCTTGGTCAGTCTCCCTAACTGTATGATTGATCCCCACTTATTGCA 
CTACATCACTGAGTTCCCGTATGC 

SEQ ID NO: 1658 

Forward primer -M62069 seg29F 
ATTGAATAATTCAGCACCTGAGGC 

SEQ ID NO: 1659 

Reverse primer- M62069 seg29R 
TTCATATGGCTACTCCCCACCT 

SEQ ID NO: 1660 
Amplicon: 

ATTGAATAATTCAGCACCTGAGGCTGGTGGATGATTCTTTGCAATTTGGCAGGAATGGGAGAGTCGGGAGCAGTAGT 
TGGCAAGGTGGGGAGTAGCCATATGAA 
SEQ ID NO: 1661

Forward primer -HUMCA1X1A seg55F: TTCTCATAGTATTCCATTGATTGGGTA 
SEQ ID NO: 1662 

Reverse primer- HUMCA1X1A seg55R 
CACCGGTATGGAGAATAGCGA 

SEQ ID NO: 1663 
Amplicon : 

TTCTCATAGTATTCCATTGATTGGGTATACCAGGTTCTGTTTACTTTTACTTGGCAGTTGATAGAATAGGTGTAGTT 
TATACTTTTTCGCTATTCTCCATACCGGTG 

SEQ ID NO: 1664 
Forward primer: 
ACCCCAAACCCAACTTGATTC 

SEQ ID NO: 1665 
Reverse primer: 
TCAGTGGTGGAGCCAAGTCTC 

SEQ ID NO: 1666 
Amplicon 

ACCCCAAACCCAACTTGATTCCTGCCATATGGAGGAGGCTCTGGAGTCCTGCTCTGTGTGGTCCAGGTCCTTTCCAC 
CCTGAGACTTGGCTCCACCACTGA 

SEQ ID NO: 1667 
Forward primer: 
CTCCTGAACCCTACTCCAAGCA 

SEQ ID NO: 1668 
Reverse primer: 
CAGGCGATCCTATGGAAATCC 

SEQ ID NO: 1669 
Amplicon 

CTCCTGAACCCTACTCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCTTCAAGAGAACTGTTCTCCAGGTCTCAGGG 
CCAGGATTTCCATAGGATCGCCTG 
SEQ ID NO: 1670

Forward primer Z25299 seg23F: 
CAAGCAATTGAGGGACCAGG 

SEQ ID NO: 1671 

Reverse primer Z25299 seg23R: CAAAAAACATTGTTAATGAGAGAGATGAC

SEQ ID NO: 1672 
Amplicon Z25299 seg23F : 

CAAGCAATTGAGGGACCAGGAAGTGGATCCTCTAGAGATGAGGAGGCATTCTGCTGGATGACTTTTAAAAATGTTTT 

CTCCAGAGTCATCTCTCTCATTAACAATGTTTTTTG 

SEQ ID NO: 1673 

HSSTROL3 seg24 Forward Primer: 
ATTTCCATCCTCAACTGGCAGA 

SEQ ID NO: 1674 

HSSTROL3 seg24 Reverse Primer: 
TGCCCTGGAACCCACG 

SEQ ID NO: 1675 
HSSTROL3 seg24 Amplicon: 

ATTTCCATCCTCAACTGGCAGAGATGAGAGCCTGGAGCATTGCAGATGCCAGGGACTTCACAAATGAAGGCACAGCA 
TGGGAAACCTGCGTGGGTTCCAGGGCA 

SEQ ID NO: 1676 

HSSTROL3 seg20-21 Forward primer HSSTROL3 seg20-21F: TCTGCTGGCCACTGTGACTG 
SEQ ID NO: 1677 

HSSTROL3 seg20-21 Reverse primer HSSTROL3 seg20-21R: GAAGAAAAAGAGCTCGCCTCG
SEQ ID NO: 1678 

HSSTROL3 seg20-21 Amplicon HSSTROL3 seg20-21: 

TCTGCTGGCCACTGTGACTGCAGCATATGCCCTCAGCATGTGTCCCTCTCTCCCACCCCAGCCAGACGCCCCGCCAG 
ATGCCTGTGAGGCCTCCTTTGACGCGGTCTCCACCATCCGAGGCGAGCTCTTTTTCTTC 



SEQ ID NO: 1679 

Forward primer HSSTROL3 junc21-27F: ACATTTGGTTCTTCCAAGGGACTAC 
SEQ ID NO: 1680

Reverse primer HSSTROL3 junc21-27R:
TCGATCTCAGAGGGCACCC 

SEQ ID NO: 1681 

Amplicon HSSTROL3 junc21-27: 

ACATTTGGTTCTTCCAAGGGACTACTGGCGTTTCCACCCCAGCACCCGGCGTGTAGACAGTCCCGTGCCCCGCAGGG 
CCACTGACTGGAGAGGGGTGCCCTCTGAGATCGA 

SEQ ID NO: 1682 
R11723seg13F

ACACTAAAAGAACAAACACCTTGCTC 

SEQ ID NO: 1683 
R11723seg13R
TCCTCAGAAGGCACATGAAAGA 

SEQ ID NO: 1684 
R11723seg13 - amplicon:

ACACTAAAAGAACAAACACCTTGCTCTTCGAGATGAGACATTTTGCCAAGCAGTTGACCACTTAGTTCTCAAGAAGC 
AACTATCTCTTTCATGTGCCTTCTGAGGA 

SEQ ID NO: 1685 
R11723junc11-18F
AGTGATGGAGCAAAGTGCCG 

SEQ ID NO: 1686 
R11723 junc11-18R
CAGCAGCTGATGCAAACTGAG 

SEQ ID NO: 1687 

R11723 junc11-18 - amplicon:

AGTGATGGAGCAAAGTGCCGGGATCATGTACCGCAAGTCCTGTGCATCATCAGCGGCCTGTCTCATCGCCTCTGCCG 
GGTACCAGTCCTTCTGCTCCCCAGGGAAACTGAACTCAGTTTGCATCAGCTGCTG 

SEQ ID NO: 1688 

H53626 junc24-27F1R3 Forward primer:
GTCCTTCCAGTGCAAGACCCA 
SEQ ID NO: 1689

H53626 junc24-27F1R3 Reverse primer:
TGGGCCTGGCAAAGCC 

SEQ ID NO: 1690 

H53626 junc24-27F1R3 Amplicon:

GTCCTTCCAGTGCAAGACCCAAAACCGCCAGGGCCACCTGTGGCCTCCTCGTCCTCGGCCACTAGCCTGCCGTGGCC 
CGTGGTCATCGGCATCCCAGCCGGCGCTGTCTTCATCCTGGGCACCCTGCTCCTGTGGCTTTGCCAGGCCCA 

SEQ ID NO: 1691 

H53626 seg25 Forward primer:

CCGACGGCTCCTACCTCAA 

SEQ ID NO: 1692 

H53626 seg25 Reverse primer:

GGAAGCTGTAGCCCATGGTGT 

SEQ ID NO: 1693 
H53626 seg25 Amplicon:

CCGACGGCTCCTACCTCAATAAGCTGCTCATCACCCGTGCCCGCCAGGACGATGCGGGCATGTACATCTGCCTTGGC 
GCCAACACCATGGGCTACAGCTTCC 

SEQ ID NO:1694 
> Q9P2J2 

EGLGEQASWAMVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRP 

PLHVIEWLRFGFLLPIFIQFGLYSPRIDPDYVGRVRLQKGASLQIEGLRVEDQGWYECRV 

FFLDQHIPEDDFANGSWVHLTVNSPPQFQETPPAVLEVQELEPVTLRCVARGSPLPHVTW 

KLRGKDLGQGQGQVQVQNGTLRIRRVERGSSGVYTCQASSTEGSATHATQLLVLGPPVIV 

VPPKNSTVNASQDVSLACHAEAYPANLTYSWFQDNINVFHISRLQPRVRILVDGSLRLLA 

TQPDDAGCYTCVPSNGLLHPPSASAYLTVLYPAQVTAMPPETPLPIGMPGVIRCPVRANP 

PLLFVSWTKDGKALQLDKFPGWSQGTEGSLIIALGNEDALGEYSCTPYNSLGTAGPSPVT 

RVLLKAPPAFIERPKEEYFQEVGRELLIPCSAQGDPPPVVSWTKVGRGLQGQAQVDSNSS 

LILRPLTKEAHGHWECSASNAVARVATSTNVYVLGTSPHVVTNVSVVALPKGANVSWEPG 

FDGGYLQRFSVWYTPLAKRPDRMHHDWVSLAVPVGAAHLLVPGLQPHTQYQFSVLAQNKL 

GSGPFSEIVLSAPEGLPTTPAAPGLPPTEIPPPLSPPRGLVAVRTPRGVLLHWDPPELVP 

KRLDGYVLEGRQGSQGWEVLDPAVAGTETELLVPGLIKDVLYEFRLVAFAGSFVSDPSNT 

ANVSTSGLEVYPSRTQLPGLLPQPVLAGVVGGVCFLGVAVLVSILAGCLLNRRRAARRRR 
KRLRQDPPLIFSPTGKSAAPSALGSGSPDSVAKLKLQGSPVPSLRQSLLWGDPAGTPSPH
PDPPSSRGPLPLEPICRGPDGRFVMGPTVAAPQERSGREQAEPRTPAQRLARSFDCSSSS 
PSGAPQPLCIEDISPVAPPPAAPPSPLPGPGPLLQYLSLPFFREMNVDGDWPPLEEPSPA 
APPDYMDTRRCPTSSFLRSPETPPVSPRESLPGAVVGAGATAEPPYTALADWTLRERLLP 
GLLPAAPRGSLTSQSSGRGSASFLRPPSTAPSAGGSYLSPAPGDTSSWASGPERWPRREH 
VVTVSKRRNTSVDENYEWDSEFPGDMELLETLHLGLASSRLRPEAEPELGVKTPEEGCLL 
NTAHVTGPEARCAALREEFLAFRRRRDATRARLPAYRQPVPHPEQATLL 



SEQ ID NO: 1695 
> AAQ88495 

MVWCLGLAVLSLVISQGADGRGKPEVVSVVGRAGESVVLGCDLLPPAGRPPLHVIEWLRFGFLLPIFIQF 
GLYSPRIDPDYVGRVRLQKGASLQIEGLRVEDQGWYECRVFFLDQHIPEDDFANGSWVHLTVNSPPQFQE 
TPPAVLEVQELEPVTLRCVARGSPLPHVTWKLRGKDLGQGQGQVQVQNGTLRIRRVERGSSGVYTCQASS
TEGSATHATQLLVLGPPVIVVPPKNSTVNASQDASLACHAEAYPANLTYSWFQDNINVFHISRLQPRVRI 
LVDGSLRLLATQPDDAGCYTCVPSNGLLHPPSASAYLTVLYPAQVTAMPPETPLPIGMPGVIRCPVRANP 
PLLFVSWTKDGKALQLDKFPGWSQGTEGSLIIALGNEDALGEYSCTPYNSLGTAGPSPVTRVLLKAPPAF 
IERPKEEYFQEVGRELLIPCSAQGDPPPAAPPSPLPGPGPLLQYLSLPFFREMNVDGDWPPLEEPSPAAP 
PDYMDTRRCPTSSFLRSPETPPVSPRESLPGAVVGAGATAEPPYTALADWTLRERLLPGLLPAAPRGSLT
SQSSGRGSASFLRPPSTAPSAGGSYLSPAPGDTSSWASGPERWPRREHVVTVSKRRNTSVDENYEWDSEF 
PGDMELLETLHLGLASSRLRPEAEPELGVKTPEEGCLLNTAHVTGPEARCAALREEFLAFRRRRDATRAR 

LPAYRQPVPHPEQATLL 

SEQ ID NO:1696 

> Q9BSH7 

(SEE VTNC_HUMAN) 

SEQ ID NO: 1697 

> Q7Z2W2 

MKYSCCALVLAVLGTELLGSLCSTVRSPRFRGRIQQERKNIRPNIILVLTDDQDVELGSL 
QVMNKTRKIMEHGGATFINAFVTTPMCCPSRSSMLTGKYVHNHNVYTNNENCSSPSWQAM 
HEPRTFAVYLNNTGYRTVFFGKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKE 
KHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPN 
ASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNML 
VETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVL 
NIDLAPTILDIAGLDTPPDVDGKSVLKLLDPEKPGNRFRTNKKAKIWRDTFLVERGKFLR 
KKEESSKNIQQSNHLPKYERVKELCQQARYQTACEQPGQKWQCIEDTSGKLRIHKCKGPS 
DLLTVRQSTRNLYARGFHDKDKECSCRESGYRASRSQRKSQRQFLRNQGTPKYKPRFVHT 
RQTRSLSVEFEGEIYDINLEEEEELQVLQPRNIAKRHDEGHKGPRDLQASSGGNRGRMLA
DSSNAVGPPTTVRVTHKCFILPNDSIHCERELYQSARAWKDHKAYIDKEIEALQDKIKNL 
REVRGHLKRRKPEECSCSKQSYYNKEKGVKKQEKLKSHLHPFKEAAQEVDSKLQLFKENN

RRRKKERKEKRRQRKGEECSLPGLTCFTHDNNHWQTAPFWNLGSFCACTSSNNNTYWCLRTVNETHNFLFCEFATGF 

LEYFDMNTDPYQLTNTVHTVERGILNQLHVQLMELRSCQGYKQ 

CNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG 



SEQ ID NO:1698 

> AAH12997 

LRSCQGYKQCNPRPKNLDVGNKDGGSYDLHRGQLWDGWEG 

SEQ ID NO:1699 

> Q8N441 

MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPPL 
TMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVVLDDI 
SPGKESLGPDSSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLKCVASGHPRP 
DITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQ 
RTRSKPVLTGTHPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGG 
QKFVVLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFRSAFLTVLPDPK 
PPGPPVASSSSATSLPWPVVIGIPAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPP 
GTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTDIHT 
HTHTHSHTHSHVEGKVHQHIHYQC 

SEQ ID NO:1700 

> Q9H4D7 

MTPSPLLLLLLPPLLLGAFPPAAAARGPPKMADKVVPRQVARLGRTVRLQCPVEGDPPPL 
TMWTKDGRTIHSGWSRFRVLPQGLKVKQVEREDAGVYVCKATNGFGSLSVNYTLVVLDDI
SPGKESLGPDSSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLKCVASGHPRP 
DITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPEDSGKYTCRVSNRAGAINATYKVDVIQ 
RTRSKPVLTGTHPVNTTVDFGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGG 
QKFVVLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYSFRSAFLTVLPDPK
PQGPPVASSSSATSLPWPVVIGIPAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGHRPP 
GTARDRSGDKDLPSLAALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTDIHT 

HTHTHSHTHSHVEGKVHQHIHYQC 
SEQ ID NO:1701

> Q9HAP5 

MLQGPGSLLLLFLASHCCLGSARGLFLFGQPDFSYKRSNCKPIPANLQLCHGIEYQNMRL 
PNLLGHETMKEVLEQAGAWIPLVMKQCHPDTKKFLCSLFAPVCLDDLDETIQPCHSLCVQ 
VKDRCAPVMSAFGFPWPDMLECDRFPQDNDLCIPLASSDHLLPATEEAPKVCEACKNKND 
DDNDIMETLCKNDFALKIKVKEITYINRDTKIILETKSKTIYKLNGVSERDLKKSVLWLK 
DSLQCTCEEMNDINAPYLVMGQKQGGELVITSVKRWQKGQREFKRISRSIRKLQC

SEQ ID NO:1702 

> AAA59968 (see DCOR_HUMAN) 

SEQ ID NO: 1703 

> AAH14562 

MQLKCNDSKAIVKTLAATGTGFDCASKTEIQLVQSLGVPPERIIYANPCKQVSQIKYAAN
NGVQMMTFDSEVELMKVARAHPKAKLVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAK 
ELNIDVVGVSFHVGSGCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVK 
LKFEEITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDD 
EDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDR 
IVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDF 
PPEVEEQDASTLPVSCAWESGMKRHRAACASASINV 

SEQ ID NO: 1704 

> Q9NWT9 

MRPRSGPTRNPRLRAFAGVPTRGRTRGQSRRCAAEASAGPERDARPGAPAAGTMGAAHSASEEVRELEGKTGFSSDQ 
IEQLHRRFKQLSGDQPTIRKENFNNVPDLELNPIRSKIVRAFF 

DNRNLRKGPSGLADEINFEDFLTIMSYFRPIDTTMDEEQVELSRKEKLRFLFHMYDSDSD 
GRITLEEYRNVKWSRSCCRETLTSRRSPLAPSPTGP 



SEQ ID NO:1705 
> Q8IXD7 

MRILQLILLALATGLVGGETRIIKGFECKPHSQPWQAALFEKTRLLCGATLIAPRWLLTA 
AHCLKPWVSLTSPTHVSPDLSSSNYCLSHLSRYIVHLGQHNLQKEEGCEQTRTATESFPH 
PGFNNSLPNKDHRNDIMLVKMASPVSITWAVRPLTLSSRCVTAGTSCLISGWGSTSSPQL
RLPHTLRCANITIIEHQKCENAYPGNITDTMVCASVQEGGKDSCQGDSGGPLVCNQSLQG 
IISWGQDPCAITRKPGVYTKVCKYVDWIQETMKNN
SEQ ID NO:1706

> Q9NS21 

MSLLPRRAPPVSMRLLAAALLLLLLALYTARVDGSKCKCSRKGPKIRYSDVKKLEMKPKY 
PHCEEKMVIITTKSVSRYRGQEHCLHPKLQSTKRFIKWYNAWNEKRRFYEE 

SEQ ID NO:1707 

> Q8IXM0 

MYAQALLVVGVLQRQAAAQHLHEHPPKLLRGHRVQERVDDRAEVEKRLREGEEDHVRPEV 
GPRPVVLGFGRSHDPPNLVGHPAYGQCHNNQPWADTSRRERQRKEKHSMRTQ 

SEQ ID NO:1708 

> Q96AC2 

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVME 
QSAGIMYRKSCASSAACLIASAGYQSFCSPGKLNSVCISCCNTPLCNGPRPKKRGSSASA 

LRPGLRTTILFLKLALFSAHC 

SEQ ID NO:1709 

> Q8N2G4 

MWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDCSSPEFIVNCTVNVQDMCQKEVME 
QSAGIMYRKSCASSAACLIASAGYQSFCSPGKLNSVCISCCNTPLCNGPRPKKRGSSASA 
LRPGLRTTILFLKLASSRHTAELKEMPPPPALFFQPSPPTPHLPE 

SEQ ID NO: 1710 

> BAC85518 

MQAPRAAPAAPLSYDRRPRDSGRMWVLGIAATFCGLFLLPGFALQIQCYQCEEFQLNNDC 
SSPEFIVNCTVNVQDMCQKEVMEQSAGIMYRKSCASSAACLIASAGYQSFCSPGKLNSVC 
ISCCNTPLCSGPRPKKRGSSASALRPGLRTTILFLRLALFSAHC 



